H.264/MPEG-4 AVC

H.264/MPEG-4 Part 10 or AVC (Advanced Video Coding) is a standard for video compression, and is currently one of the most commonly used formats for the recording, compression, and distribution of high definition video. The final drafting work on the first version of the standard was completed in May 2003.

H.264/MPEG-4 AVC is a block-oriented motion-compensation-based codec standard developed by the ITU-T Video Coding Experts Group (VCEG) together with the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG). It was the product of a partnership effort known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 AVC standard (formally, ISO/IEC 14496-10 – MPEG-4 Part 10, Advanced Video Coding) are jointly maintained so that they have identical technical content.

H.264 is perhaps best known as being one of the codec standards for Blu-ray Discs; all Blu-ray Disc players must be able to decode H.264. It is also widely used by streaming internet sources, such as videos from Vimeo, YouTube, and the iTunes Store, web software such as the Adobe Flash Player and Microsoft Silverlight, broadcast services for DVB and SBTVD, direct-broadcast satellite television services, cable television services, and real-time videoconferencing.

Overview

The intent of the H.264/AVC project was to create a standard capable of providing good video quality at substantially lower bit rates than previous standards (i.e., half or less the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without increasing the complexity of design so much that it would be impractical or excessively expensive to implement. An additional goal was to provide enough flexibility to allow the standard to be applied to a wide variety of applications on a wide variety of networks and systems, including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems.

The H.264 standard can be viewed as a "family of standards", the members of which are the profiles described below. A specific decoder decodes at least one, but not necessarily all profiles. The decoder specification describes which of the profiles can be decoded.

The H.264 name follows the ITU-T naming convention, where the standard is a member of the H.26x line of VCEG video coding standards; the MPEG-4 AVC name relates to the naming convention in ISO/IEC MPEG, where the standard is part 10 of ISO/IEC 14496, which is the suite of standards known as MPEG-4. The standard was developed jointly in a partnership of VCEG and MPEG, after earlier development work in the ITU-T as a VCEG project called H.26L. It is thus common to refer to the standard with names such as H.264/AVC, AVC/H.264, H.264/MPEG-4 AVC, or MPEG-4/H.264 AVC, to emphasize the common heritage. Occasionally, it is also referred to as "the JVT codec", in reference to the Joint Video Team (JVT) organization that developed it. (Such partnership and multiple naming is not uncommon. For example, the video codec standard known as MPEG-2 also arose from the partnership between MPEG and the ITU-T, where MPEG-2 video is known to the ITU-T community as H.262.[1]) Some software programs (such as VLC media player) internally identify this standard as AVC1.

The standardization of the first version of H.264/AVC was completed in May 2003. In the first project to extend the original standard, the JVT then developed what was called the Fidelity Range Extensions (FRExt). These extensions enabled higher quality video coding by supporting increased sample bit depth precision and higher-resolution color information, including sampling structures known as Y'CbCr 4:2:2 (=YUV 4:2:2) and Y'CbCr 4:4:4. Several other features were also included in the Fidelity Range Extensions project, such as adaptive switching between 4×4 and 8×8 integer transforms, encoder-specified perceptual-based quantization weighting matrices, efficient inter-picture lossless coding, and support of additional color spaces. The design work on the Fidelity Range Extensions was completed in July 2004, and the drafting work on them was completed in September 2004.

Further recent extensions of the standard then included adding five other new profiles intended primarily for professional applications, adding extended-gamut color space support, defining additional aspect ratio indicators, defining two additional types of "supplemental enhancement information" (post-filter hint and tone mapping), and deprecating one of the prior FRExt profiles that industry feedback indicated should have been designed differently.

The next major feature added to the standard was Scalable Video Coding (SVC). Specified in Annex G of H.264/AVC, SVC allows the construction of bitstreams that contain sub-bitstreams that also conform to the standard, including one such bitstream known as the "base layer" that can be decoded by an H.264/AVC decoder that does not support SVC. For temporal bitstream scalability, i.e., the presence of a sub-bitstream with a smaller temporal sampling rate than the bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. In this case, high-level syntax and inter prediction reference pictures in the bitstream are constructed accordingly. For spatial and quality bitstream scalability, i.e., the presence of a sub-bitstream with lower spatial resolution or quality than the bitstream, NAL (Network Abstraction Layer) units are removed from the bitstream when deriving the sub-bitstream. In this case, inter-layer prediction, i.e., the prediction of the higher spatial resolution or quality signal from data of the lower spatial resolution or quality signal, is typically used for efficient coding. The Scalable Video Coding extensions were completed in November 2007.
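
The temporal-scalability mechanism described above can be illustrated with a short sketch. This is not a bitstream parser: it assumes access units have already been tagged with a temporal layer identifier (carried in the SVC NAL unit header extension) and only shows how a lower-frame-rate sub-bitstream is derived by dropping complete access units above a target layer. The AccessUnit container and its field names are illustrative, not taken from the standard's reference software.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AccessUnit:
    """One coded picture plus its associated NAL units (illustrative container)."""
    poc: int           # picture order count (display order)
    temporal_id: int   # 0 = base temporal layer; higher ids add frame rate
    payload: bytes     # coded data, not interpreted here

def extract_temporal_sublayer(aus: List[AccessUnit], target_tid: int) -> List[AccessUnit]:
    """Derive a temporally scaled sub-bitstream by dropping complete access
    units whose temporal_id exceeds the target layer, as described above."""
    return [au for au in aus if au.temporal_id <= target_tid]

# Example: a 4-picture group with dyadic temporal layering (0, 2, 1, 2).
gop = [AccessUnit(poc=i, temporal_id=tid, payload=b"")
       for i, tid in enumerate([0, 2, 1, 2])]
half_rate = extract_temporal_sublayer(gop, target_tid=1)  # keeps POC 0 and 2
print([au.poc for au in half_rate])                        # -> [0, 2]
```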

The next major feature added to the standard was Multiview Video Coding (MVC). Specified in Annex H of H.264/AVC, MVC enables the construction of bitstreams that represent more than one view of a video scene. An important example of this functionality is stereoscopic 3D video coding. Two profiles were developed in the MVC work: Multiview High Profile supports an arbitrary number of views, and Stereo High Profile is designed specifically for two-view stereoscopic video. The Multiview Video Coding extensions were completed in November 2009.

Standardization committee and history

In early 1998, the Video Coding Experts Group (VCEG – ITU-T SG16 Q.6) issued a call for proposals on a project called H.26L, with the target of doubling the coding efficiency (which means halving the bit rate necessary for a given level of fidelity) in comparison to any other existing video coding standards for a broad variety of applications. VCEG was chaired by Gary Sullivan (Microsoft, formerly PictureTel, USA). The first draft design for that new standard was adopted in August 1999. In 2000, Thomas Wiegand (Heinrich Hertz Institute, Germany) became VCEG co-chair. In December 2001, VCEG and the Moving Picture Experts Group (MPEG – ISO/IEC JTC 1/SC 29/WG 11) formed a Joint Video Team (JVT), with the charter to finalize the video coding standard. Formal approval of the specification came in March 2003. The JVT was (and still is) chaired by Gary Sullivan, Thomas Wiegand, and Ajay Luthra (Motorola, USA). In June 2004, the Fidelity Range Extensions (FRExt) project was finalized. From January 2005 to November 2007, the JVT worked on an extension of H.264/AVC towards scalability by an Annex (G) called Scalable Video Coding (SVC). The JVT management team was extended by Jens-Rainer Ohm (Aachen University, Germany). From July 2006 to November 2009, the JVT worked on Multiview Video Coding (MVC), an extension of H.264/AVC towards free viewpoint television and 3D television. That work included the development of two new profiles of the standard: the Multiview High Profile and the Stereo High Profile.

Applications

The H.264 video format has a very broad application range that covers all forms of digital compressed video from low bit-rate Internet streaming applications to HDTV broadcast and Digital Cinema applications with nearly lossless coding. With the use of H.264, bit rate savings of 50%[2] or more are reported. For example, H.264 has been reported to give the same Digital Satellite TV quality as current MPEG-2 implementations with less than half the bitrate, with current MPEG-2 implementations working at around 3.5 Mbit/s and H.264 at only 1.5 Mbit/s.[3] To ensure compatibility and problem-free adoption of H.264/AVC, many standards bodies have amended or added to their video-related standards so that users of these standards can employ H.264/AVC.

Both the Blu-ray Disc format and the now-discontinued HD DVD format include the H.264/AVC High Profile as one of 3 mandatory video compression formats.

The Digital Video Broadcast project (DVB) approved the use of H.264/AVC for broadcast television in late 2004.

The Advanced Television Systems Committee (ATSC) standards body in the United States approved the use of H.264/AVC for broadcast television in July 2008, although the standard is not yet used for fixed ATSC broadcasts within the United States.[4][5] It has also been approved for use with the more recent ATSC-M/H (Mobile/Handheld) standard, using the AVC and SVC portions of H.264.[6]

AVCHD is a high-definition recording format designed by Sony and Panasonic that uses H.264 (conforming to H.264 while adding additional application-specific features and constraints).

AVC-Intra is an intraframe-only compression format, developed by Panasonic.

The CCTV (Closed Circuit TV) and Video Surveillance markets have included the technology in many products.

Canon DSLRs use H.264 in the QuickTime MOV container as their native recording format.

Patent licensing

In countries where patents on software algorithms are upheld, vendors and commercial users of products that use H.264/AVC are expected to pay patent licensing royalties for the patented technology[7] that their products use. This applies to the Baseline Profile as well.[8] A private organization known as MPEG LA, which is not affiliated in any way with the MPEG standardization organization, administers the licenses for patents applying to this standard, as well as the patent pools for MPEG-2 Part 1 Systems, MPEG-2 Part 2 Video, MPEG-4 Part 2 Video, and other technologies. The MPEG-LA patents in the US last at least until 2027.[9]

On August 26, 2010, MPEG LA announced that H.264-encoded internet video that is free to end users will never be charged royalties.[10] All other royalties remain in place, such as the royalties for products that decode and encode H.264 video.[11] The license terms are updated in 5-year blocks.[12]

In 2005, Qualcomm, which was the assignee of U.S. Patent 5,452,104 and U.S. Patent 5,576,767, sued Broadcom in US District Court, alleging that Broadcom infringed the two patents by making products that were compliant with the H.264 video compression standard.[13] In 2007, the District Court found that the patents were unenforceable because Qualcomm had failed to disclose them to the JVT prior to the release of the H.264 standard in May 2003.[13] In December 2008, the US Court of Appeals for the Federal Circuit affirmed the District Court's order that the patents be unenforceable but remanded to the District Court with instructions to limit the scope of unenforceability to H.264 compliant products.[13]

Controversies

Controversies surrounding the H.264 video compression standard stem primarily from its use within the HTML5 Internet standard. HTML5 adds two new tags to the HTML standard, <video> and <audio>, for direct embedding of video and audio content into a web page. HTML5 is being developed by the HTML5 working group as an open standard to be adopted by all web browser developers. In 2009, the HTML5 working group was split between supporters of Ogg Theora, a free video format that its developers believe is unencumbered by patents, and H.264, which contains patented technology. As late as July 2009, Google and Apple were said to support H.264, while Mozilla and Opera supported Ogg Theora.[14] Microsoft, with the release of Internet Explorer 9, added support for both HTML5 and H.264. Asked "HTML 5 or Silverlight?" at the Gartner Symposium/ITxpo in November 2010, Microsoft CEO Steve Ballmer answered, "If you want to do something that is universal, there is no question the world is going HTML5."[15] However, in January 2011, Google announced that it was pulling H.264 support from its Chrome browser and would support only Theora and WebM/VP8, in order to rely exclusively on open formats.[16]

Features

H.264/AVC/MPEG-4 Part 10 contains a number of new features that allow it to compress video much more effectively than older standards and to provide more flexibility for application to a wide variety of network environments. In particular, some such key features include:

These techniques, along with several others, help H.264 to perform significantly better than any prior standard under a wide variety of circumstances in a wide variety of application environments. H.264 can often perform radically better than MPEG-2 video—typically obtaining the same quality at half of the bit rate or less, especially in high bit rate and high resolution situations.[19]

Like other ISO/IEC MPEG video standards, H.264/AVC has a reference software implementation that can be freely downloaded.[20] Its main purpose is to give examples of H.264/AVC features, rather than to be a useful application per se. Some reference hardware design work is also under way in the Moving Picture Experts Group. The features mentioned above cover the complete feature set of H.264/AVC across all of its profiles. A profile for a codec is a set of features of that codec identified to meet a certain set of specifications of intended applications; this means that many of the listed features are not supported in some profiles. The various profiles of H.264/AVC are discussed in the next section.

Profiles

The standard defines 18 sets of capabilities, which are referred to as profiles, targeting specific classes of applications.

Profiles for non-scalable 2D video applications include the following:

Constrained Baseline Profile (CBP)
Primarily for low-cost applications, this profile is most typically used in videoconferencing and mobile applications. It corresponds to the subset of features that are in common between the Baseline, Main, and High Profiles described below.
Baseline Profile (BP)
Primarily for low-cost applications that require additional data loss robustness, this profile is used in some videoconferencing and mobile applications. This profile includes all features that are supported in the Constrained Baseline Profile, plus three additional features that can be used for loss robustness (or for other purposes such as low-delay multi-point video stream compositing). The importance of this profile has faded somewhat since the definition of the Constrained Baseline Profile in 2009. All Constrained Baseline Profile bitstreams are also considered to be Baseline Profile bitstreams, as these two profiles share the same profile identifier code value (profile_idc); see the sketch following this profile list.
Main Profile (MP)
This profile is used for standard-definition digital TV broadcasts that use the MPEG-4 format as defined in the DVB standard.[21] It is not, however, used for high-definition television broadcasts, as the importance of this profile faded when the High Profile was developed in 2004 for that application.
Extended Profile (XP)
Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.
High Profile (HiP)
The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (for example, this is the profile adopted by the Blu-ray Disc storage format and the DVB HDTV broadcast service).
Progressive High Profile (PHiP)
Similar to the High profile, but without support of field coding features.
High 10 Profile (Hi10P)
Going beyond typical mainstream consumer product capabilities, this profile builds on top of the High Profile, adding support for up to 10 bits per sample of decoded picture precision.
High 4:2:2 Profile (Hi422P)
Primarily targeting professional applications that use interlaced video, this profile builds on top of the High 10 Profile, adding support for the 4:2:2 chroma subsampling format while using up to 10 bits per sample of decoded picture precision.
High 4:4:4 Predictive Profile (Hi444PP)
This profile builds on top of the High 4:2:2 Profile, supporting up to 4:4:4 chroma sampling, up to 14 bits per sample, and additionally supporting efficient lossless region coding and the coding of each picture as three separate color planes.
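
For illustration, the minimal sketch below maps the profile indicator values (profile_idc) that an H.264 bitstream signals in its sequence parameter set to the profile names above. The numeric codes follow the published specification; Constrained Baseline shares code 66 with Baseline and is distinguished by the constraint_set1_flag, which is why the Baseline Profile entry above notes a shared identifier. Progressive High and the High Intra profiles reuse these codes with additional constraint flags and are not distinguished here; the helper itself is only an illustrative lookup, not a parser.

```python
# profile_idc values signalled in the sequence parameter set (SPS).
PROFILE_IDC = {
    66:  "Baseline",                 # Constrained Baseline if constraint_set1_flag is set
    77:  "Main",
    88:  "Extended",
    100: "High",
    110: "High 10",
    122: "High 4:2:2",
    244: "High 4:4:4 Predictive",
    44:  "CAVLC 4:4:4 Intra",
}

def profile_name(profile_idc: int, constraint_set1_flag: bool = False) -> str:
    """Map an SPS profile_idc (plus constraint_set1_flag) to a profile name."""
    name = PROFILE_IDC.get(profile_idc, "unknown/other")
    if profile_idc == 66 and constraint_set1_flag:
        name = "Constrained Baseline"
    return name

print(profile_name(100))        # -> High
print(profile_name(66, True))   # -> Constrained Baseline
```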

For camcorders, editing, and professional applications, the standard contains four additional Intra-frame-only profiles, which are defined as simple subsets of other corresponding profiles. These are mostly for professional (e.g., camera and editing system) applications:

High 10 Intra Profile
The High 10 Profile constrained to all-Intra use.
High 4:2:2 Intra Profile
The High 4:2:2 Profile constrained to all-Intra use.
High 4:4:4 Intra Profile
The High 4:4:4 Profile constrained to all-Intra use.
CAVLC 4:4:4 Intra Profile
The High 4:4:4 Profile constrained to all-Intra use and to CAVLC entropy coding (i.e., not supporting CABAC).

As a result of the Scalable Video Coding (SVC) extension, the standard contains three additional scalable profiles, which are defined as a combination of an H.264/AVC profile for the base layer (identified by the second word in the scalable profile name) and tools that achieve the scalable extension:

Scalable Baseline Profile
Primarily targeting video conferencing, mobile, and surveillance applications, this profile builds on top of a constrained version of the H.264/AVC Baseline profile to which the base layer (a subset of the bitstream) must conform. For the scalability tools, a subset of the available tools is enabled.
Scalable High Profile
Primarily targeting broadcast and streaming applications, this profile builds on top of the H.264/AVC High Profile to which the base layer must conform.
Scalable High Intra Profile
Primarily targeting production applications, this profile is the Scalable High Profile constrained to all-Intra use.

As a result of the Multiview Video Coding (MVC) extension, the standard contains two multiview profiles:

Stereo High Profile
This profile targets two-view stereoscopic 3D video and combines the tools of the High profile with the inter-view prediction capabilities of the MVC extension.
Multiview High Profile
This profile supports two or more views using both inter-picture (temporal) and MVC inter-view prediction, but does not support field pictures and macroblock-adaptive frame-field coding.
Feature support in particular profiles
Feature CBP BP XP MP HiP Hi10P Hi422P Hi444PP
I and P slices Yes Yes Yes Yes Yes Yes Yes Yes
Chroma formats 4:2:0 4:2:0 4:2:0 4:2:0 4:2:0 4:2:0 4:2:0/4:2:2 4:2:0/4:2:2/4:4:4
Sample depths (bits) 8 8 8 8 8 8 to 10 8 to 10 8 to 14
Flexible macroblock ordering (FMO) No Yes Yes No No No No No
Arbitrary slice ordering (ASO) No Yes Yes No No No No No
Redundant slices (RS) No Yes Yes No No No No No
Data Partitioning No No Yes No No No No No
SI and SP slices No No Yes No No No No No
B slices No No Yes Yes Yes Yes Yes Yes
Interlaced coding (PicAFF, MBAFF) No No Yes Yes Yes Yes Yes Yes
Multiple reference frames Yes Yes Yes Yes Yes Yes Yes Yes
In-loop deblocking filter Yes Yes Yes Yes Yes Yes Yes Yes
CAVLC entropy coding Yes Yes Yes Yes Yes Yes Yes Yes
CABAC entropy coding No No No Yes Yes Yes Yes Yes
8×8 vs. 4×4 transform adaptivity No No No No Yes Yes Yes Yes
Quantization scaling matrices No No No No Yes Yes Yes Yes
Separate Cb and Cr QP control No No No No Yes Yes Yes Yes
Monochrome (4:0:0) No No No No Yes Yes Yes Yes
Separate color plane coding No No No No No No No Yes
Predictive lossless coding No No No No No No No Yes
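
The feature matrix above can also be encoded directly as data. The sketch below transcribes a few columns of the table (profile keys and feature names are just abbreviated table labels, nothing from the standard's syntax) and answers whether a given profile supports a given feature, or which of the listed profiles is the first to support a required feature set.

```python
# A few columns of the feature table above, transcribed as Python data.
PROFILE_FEATURES = {
    "CBP": {"B slices": False, "CABAC": False, "8x8 transform": False, "interlaced": False},
    "BP":  {"B slices": False, "CABAC": False, "8x8 transform": False, "interlaced": False},
    "MP":  {"B slices": True,  "CABAC": True,  "8x8 transform": False, "interlaced": True},
    "HiP": {"B slices": True,  "CABAC": True,  "8x8 transform": True,  "interlaced": True},
}

def supports(profile: str, feature: str) -> bool:
    """Look up one profile/feature cell of the table above."""
    return PROFILE_FEATURES[profile][feature]

def minimal_profile(required: set) -> str:
    """Return the first profile, in the order listed, supporting every required feature."""
    for profile, feats in PROFILE_FEATURES.items():
        if all(feats.get(f, False) for f in required):
            return profile
    raise ValueError("no listed profile supports this feature set")

print(supports("MP", "CABAC"))                         # -> True
print(minimal_profile({"B slices", "8x8 transform"}))  # -> HiP
```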

Levels

As the term is used in the standard, a "level" is a specified set of constraints indicating a degree of required decoder performance for a profile. For example, a level of support within a profile will specify the maximum picture resolution, frame rate, and bit rate that a decoder may be capable of using. A decoder that conforms to a given level is required to be capable of decoding all bitstreams that are encoded for that level and for all lower levels.

Levels with maximum property values
Level | Max macroblocks per second | Max macroblocks per frame | Max video bit rate, VCL (kbit/s): BP, XP, MP / HiP / Hi10P / Hi422P, Hi444PP | Examples for high resolution @ frame rate (max stored frames)
1 | 1,485 | 99 | 64 / 80 / 192 / 256 | 128×96@30.9 (8); 176×144@15.0 (4)
1b | 1,485 | 99 | 128 / 160 / 384 / 512 | 128×96@30.9 (8); 176×144@15.0 (4)
1.1 | 3,000 | 396 | 192 / 240 / 576 / 768 | 176×144@30.3 (9); 320×240@10.0 (3); 352×288@7.5 (2)
1.2 | 6,000 | 396 | 384 / 480 / 1,152 / 1,536 | 320×240@20.0 (7); 352×288@15.2 (6)
1.3 | 11,880 | 396 | 768 / 960 / 2,304 / 3,072 | 320×240@36.0 (7); 352×288@30.0 (6)
2 | 11,880 | 396 | 2,000 / 2,500 / 6,000 / 8,000 | 320×240@36.0 (7); 352×288@30.0 (6)
2.1 | 19,800 | 792 | 4,000 / 5,000 / 12,000 / 16,000 | 352×480@30.0 (7); 352×576@25.0 (6)
2.2 | 20,250 | 1,620 | 4,000 / 5,000 / 12,000 / 16,000 | 352×480@30.7 (10); 352×576@25.6 (7); 720×480@15.0 (6); 720×576@12.5 (5)
3 | 40,500 | 1,620 | 10,000 / 12,500 / 30,000 / 40,000 | 352×480@61.4 (12); 352×576@51.1 (10); 720×480@30.0 (6); 720×576@25.0 (5)
3.1 | 108,000 | 3,600 | 14,000 / 17,500 / 42,000 / 56,000 | 720×480@80.0 (13); 720×576@66.7 (11); 1,280×720@30.0 (5)
3.2 | 216,000 | 5,120 | 20,000 / 25,000 / 60,000 / 80,000 | 1,280×720@60.0 (5); 1,280×1,024@42.2 (4)
4 | 245,760 | 8,192 | 20,000 / 25,000 / 60,000 / 80,000 | 1,280×720@68.3 (9); 1,920×1,080@30.1 (4); 2,048×1,024@30.0 (4)
4.1 | 245,760 | 8,192 | 50,000 / 62,500 / 150,000 / 200,000 | 1,280×720@68.3 (9); 1,920×1,080@30.1 (4); 2,048×1,024@30.0 (4)
4.2 | 522,240 | 8,704 | 50,000 / 62,500 / 150,000 / 200,000 | 1,920×1,080@64.0 (4); 2,048×1,080@60.0 (4)
5 | 589,824 | 22,080 | 135,000 / 168,750 / 405,000 / 540,000 | 1,920×1,080@72.3 (13); 2,048×1,024@72.0 (13); 2,048×1,080@67.8 (12); 2,560×1,920@30.7 (5); 3,680×1,536@26.7 (5)
5.1 | 983,040 | 36,864 | 240,000 / 300,000 / 720,000 / 960,000 | 1,920×1,080@120.5 (16); 4,096×2,048@30.0 (5); 4,096×2,304@26.7 (5)
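
As a worked illustration of how the level limits above constrain a stream, the sketch below checks a proposed resolution and frame rate against the macroblock-rate and frame-size columns for a few levels. The numbers are copied from the table; the bit-rate limits and the remaining Annex A constraints are omitted, so this is only an approximation, not a conformance checker.

```python
import math

# (max macroblocks per second, max macroblocks per frame) for a few levels,
# copied from the table above; the full table has more levels and more limits.
LEVEL_LIMITS = {
    "3.1": (108_000, 3_600),
    "4":   (245_760, 8_192),
    "4.1": (245_760, 8_192),
    "5.1": (983_040, 36_864),
}

def fits_level(level: str, width: int, height: int, fps: float) -> bool:
    """Check a resolution/frame rate against a level's macroblock limits only."""
    max_mbs_per_sec, max_mbs_per_frame = LEVEL_LIMITS[level]
    mbs_per_frame = math.ceil(width / 16) * math.ceil(height / 16)
    return (mbs_per_frame <= max_mbs_per_frame and
            mbs_per_frame * fps <= max_mbs_per_sec)

print(fits_level("3.1", 1280, 720, 30))   # -> True  (matches the 1,280×720@30.0 example)
print(fits_level("3.1", 1920, 1080, 30))  # -> False (1080p30 needs Level 4)
print(fits_level("4",   1920, 1080, 30))  # -> True
```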

Decoded picture buffering

Previously encoded pictures are used by H.264/AVC encoders to provide predictions of the values of samples in other pictures. This allows the encoder to make efficient decisions on the best way to encode a given picture. At the decoder, such pictures are stored in a virtual decoded picture buffer (DPB). The maximum capacity of the DPB, in units of frames (or pairs of fields), as shown in parentheses in the right column of the table above, can be computed as follows:

Standard equation: Min(Floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)
Excel-compatible formula: =MIN(FLOOR(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs); 1); 16)

Where MaxDpbMbs is a constant value provided in the table below as a function of level number, and PicWidthInMbs and FrameHeightInMbs are the picture width and frame height for the coded video data, expressed in units of macroblocks (rounded up to integer values and accounting for cropping and macroblock pairing when applicable). This formula is specified in sections A.3.1.h and A.3.2.f of the 2009 edition of the standard.

Level | MaxDpbMbs
1 | 396
1b | 396
1.1 | 900
1.2 | 2,376
1.3 | 2,376
2 | 2,376
2.1 | 4,752
2.2 | 8,100
3 | 8,100
3.1 | 18,000
3.2 | 20,480
4 | 32,768
4.1 | 32,768
4.2 | 34,816
5 | 110,400
5.1 | 184,320

For example, for an HDTV picture that is 1920 samples wide (PicWidthInMbs = 120) and 1080 samples high (FrameHeightInMbs = 68), a Level 4 decoder has a maximum DPB storage capacity of Floor(32768/(120*68)) = 4 frames (or 8 fields) when encoded with minimal cropping parameter values. Thus, the value 4 is shown in parentheses in the table above in the right column of the row for Level 4 with the frame size 1920×1080.
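
A minimal sketch of the DPB capacity computation above: MAX_DPB_MBS transcribes the table, picture dimensions are rounded up to whole 16×16 macroblocks, and the example reproduces the Level 4, 1920×1080 case from the previous paragraph. Cropping and macroblock pairing are ignored for simplicity.

```python
import math

# MaxDpbMbs per level, transcribed from the table above.
MAX_DPB_MBS = {
    "1": 396, "1b": 396, "1.1": 900, "1.2": 2376, "1.3": 2376, "2": 2376,
    "2.1": 4752, "2.2": 8100, "3": 8100, "3.1": 18000, "3.2": 20480,
    "4": 32768, "4.1": 32768, "4.2": 34816, "5": 110400, "5.1": 184320,
}

def max_dpb_frames(level: str, width: int, height: int) -> int:
    """Min(Floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16),
    with picture dimensions rounded up to whole macroblocks."""
    pic_width_in_mbs = math.ceil(width / 16)
    frame_height_in_mbs = math.ceil(height / 16)
    return min(MAX_DPB_MBS[level] // (pic_width_in_mbs * frame_height_in_mbs), 16)

print(max_dpb_frames("4", 1920, 1080))    # -> 4, as in the worked example above
print(max_dpb_frames("5.1", 1920, 1080))  # -> 16 (capped at 16 frames)
```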

It is important to note that the current picture being decoded is not included in the computation of DPB fullness (unless the encoder has indicated for it to be stored for use as a reference for decoding other pictures or for delayed output timing). Thus, a decoder needs to actually have sufficient memory to handle (at least) one frame more than the maximum capacity of the DPB as calculated above.

Versions

Versions of the H.264/AVC standard include the following completed revisions, corrigenda, and amendments (dates are final approval dates in ITU-T, while final "International Standard" approval dates in ISO/IEC are somewhat different and slightly later in most cases). Each version represents changes relative to the next lower version that is integrated into the text. Bold faced versions are published (or planned to be published).

Software encoder feature comparison

AVC software implementations
Feature QT Nero LEAD x264 MainConcept Elecard TSE VSofts ProCoder Avivo Elemental IPP
I and P slices Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
B slices Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes
SI and SP slices No No No No No No No No No No No No
Multiple reference frames Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes
In-loop deblocking filter Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
Flexible Macroblock Ordering (FMO) No No No No No No No Yes No No No No
Arbitrary slice ordering (ASO) No No No No No No No No No No No No
Redundant slices (RS) No No No No No No No No No No No No
Data partitioning No No No No No No No No No No No No
Interlaced coding (PicAFF, MBAFF) No MBAFF MBAFF MBAFF Yes Yes No MBAFF Yes MBAFF Yes No
CAVLC entropy coding Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
CABAC entropy coding Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes
8×8 vs. 4×4 transform adaptivity No Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes
Quantization scaling matrices No No No Yes Yes No No Yes No No No No
Separate Cb and Cr QP control No No No Yes Yes Yes No Yes No No No No
Monochrome (4:0:0) No No No No No No No Yes No No No No
Chroma formats 4:2:0 4:2:0 4:2:0 4:2:0/4:2:2[22]/4:4:4[23] 4:2:0/4:2:2 4:2:0 4:2:0/4:2:2 4:2:0/4:2:2/4:4:4 4:2:0 4:2:0 4:2:0 4:2:0
Largest sample depth (bit) 8 8 8 10[24] 10 8 8 10 8 8 8 12
Separate color plane coding No No No No No No No No No No No No
Predictive lossless coding No No No Yes[25] No No No No No No No No
Film grain modeling No No No No No No No Yes No No No No
Fully supported profiles
Profile QT Nero LEAD x264 MainConcept Elecard TSE VSofts ProCoder Avivo Elemental IPP
Constrained baseline Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
Baseline No No No No No No No No No No No No
Extended No No No No No No No No No No No No
Main No Yes/No Yes/No Yes Yes/No Yes No Yes/No Yes No Yes No
High No No No No No No No No No No No No

Hardware-based encoding and decoding

Because H.264 encoding and decoding requires significant computing power in specific types of arithmetic operations, software implementations that run on general-purpose CPUs are typically less power efficient. However, the latest quad-core general-purpose x86 CPUs have sufficient computation power to perform real-time SD and HD encoding. Compression efficiency depends on video algorithmic implementations, not on whether hardware or software implementation is used. Therefore, the difference between hardware and software based implementation is more on power-efficiency, flexibility and cost. To improve the power efficiency and reduce hardware form-factor, special-purpose hardware may be employed, either for the complete encoding or decoding process, or for acceleration assistance within a CPU-controlled environment.

CPU-based solutions are known to be much more flexible, particularly when encoding must be done concurrently in multiple formats, at multiple bit rates and resolutions (multi-screen), and possibly with additional features such as container format support, advanced integrated advertising features, etc. A CPU-based software solution also generally makes it much easier to load-balance multiple concurrent encoding sessions within the same CPU.

The 2nd generation Intel Core i processors i3/i5/i7 (code named "Sandy Bridge") introduced at the January 2011 CES (Consumer Electronics Show) offer an on-chip hardware full HD H.264 encoder.[26] The Intel marketing name for the on-chip H.264 encoder feature is "Intel® Quick Sync Video".[27]

A hardware H.264 encoder can be an ASIC or an FPGA. An FPGA is a general programmable chip; to use an FPGA as a hardware encoder, an H.264 encoder design is required to customize the chip for the application. By 2009, a full HD H.264 encoder (High Profile, Level 4.1, 1080p at 30 fps) could run on a single low-cost FPGA chip.

ASIC encoders with H.264 encoder functionality are available from many different semiconductor companies, but the core design used in the ASIC is typically licensed from one of a few companies such as Chips&Media, On2 (formerly Hantro, acquired by Google), and Imagination. Some companies have both FPGA and ASIC product offerings.[28]

Texas Instruments manufactures a line of ARM + DSP cores that perform H.264 BP encoding of 1080p video at 30 fps on the DSP.[29] This permits flexibility with respect to codecs (which are implemented as highly optimized DSP code) while being more efficient than software on a generic CPU.

See also

References

  1. ^ "H.262 : Information technology — Generic coding of moving pictures and associated audio information: Video". http://itu.int/rec/T-REC-H.262. Retrieved 2007-04-15. 
  2. ^ "H.264 Joint Video Surveillance Group Compression Research Data: 2008". Jvsg.com. http://www.jvsg.com/. Retrieved 2010-05-17. 
  3. ^ Wenger, et al.. RFC 3984 : RTP Payload Format for H.264 Video. p. 2. http://tools.ietf.org/html/rfc3984#page-2. 
  4. ^ "ATSC Standard A/72 Part 1: Video System Characteristics of AVC in the ATSC Digital Television System" (PDF). http://www.atsc.org/cms/standards/a_72_part_1.pdf. Retrieved 2011-07-30. 
  5. ^ "ATSC Standard A/72 Part 2: AVC Video Transport Subsystem Characteristics" (PDF). http://www.atsc.org/cms/standards/a_72_part_2.pdf. Retrieved 2011-07-30. 
  6. ^ "ATSC Standard A/153 Part 7: AVC and SVC Video System Characteristics" (PDF). http://atsc.org/cms/standards/a153/a_153-Part-7-2009.pdf. Retrieved 2011-07-30. 
  7. ^ "Summary of AVC/H.264 License Terms". http://www.mpegla.com/main/programs/AVC/Documents/AVC_TermsSummary.pdf. Retrieved 2010-03-25. 
  8. ^ "OMS Video, A Project of Sun's Open Media Commons Initiative". http://blogs.sun.com/openmediacommons/entry/oms_video_a_project_of. Retrieved 2008-08-26. 
  9. ^ "US Patent Expiration for MP3, MPEG-2, H.264". http://www.osnews.com/story/24954/US_Patent_Expiration_for_MP3_MPEG-2_H_264. Notes MPEG LA patent US 7,826,532, filed September 5, 2003, with a 1,546-day patent term extension. http://patft1.uspto.gov/netacgi/nph-Parser?patentnumber=7826532 http://www.google.com/patents/about?id=2onYAAAAEBAJ
  10. ^ "MPEG LA’s AVC License Will Not Charge Royalties for Internet Video that is Free to End Users through Life of License". MPEG LA. 2010-08-26. http://www.mpegla.com/Lists/MPEG%20LA%20News%20List/Attachments/231/n-10-08-26.pdf. Retrieved 2010-08-26. 
  11. ^ "MPEG LA Cuts Royalties from Free Web Video, Forever". pcmag.com. 2010-08-26. http://www.pcmag.com/article2/0,2817,2368359,00.asp. Retrieved 2010-08-26. 
  12. ^ "AVC FAQ". Mpeg La. 2002-08-01. http://www.mpegla.com/main/programs/AVC/Pages/FAQ.aspx. Retrieved 2010-05-17. 
  13. ^ a b c See Qualcomm Inc. v. Broadcom Corp., No. 2007-1545, 2008-1162 (Fed. Cir. Dec. 1, 2008). For articles in the popular press, see signonsandiego.com, "Qualcomm loses its patent-rights case" and "Qualcomm's patent case goes to jury"; and bloomberg.com "Broadcom Wins First Trial in Qualcomm Patent Dispute"
  14. ^ "Decoding the HTML 5 video codec debate". Ars Technica. 2009-07-06. http://arstechnica.com/open-source/news/2009/07/decoding-the-html-5-video-codec-debate.ars. Retrieved 2011-01-12. 
  15. ^ "Steve Ballmer, CEO Microsoft, interviewed at Gartner Symposium/ITxpo Orlando 2010". Gartnervideo. 2010-11. http://www.youtube.com/watch?v=iI47b3a9cEI. Retrieved 2011-01-12. 
  16. ^ "HTML Video Codec Support in Chrome". 2011-01-11. http://blog.chromium.org/2011/01/html-video-codec-support-in-chrome.html. Retrieved 2011-01-12. 
  17. ^ "The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions" (PDF). http://www.fastvdo.com/spie04/spie04-h264OverviewPaper.pdf. Retrieved 2011-07-30. 
  18. ^ a b c RFC 3984, p.3
  19. ^ Apple Inc. (1999-03-26). "H.264 FAQ". Apple. http://www.apple.com/quicktime/technologies/h264/faq.html. Retrieved 2010-05-17. 
  20. ^ Karsten Suehring. "H.264/AVC JM Reference Software Download". Iphome.hhi.de. http://iphome.hhi.de/suehring/tml/download/. Retrieved 2010-05-17. 
  21. ^ "TS 101 154 – V1.9.1 – Digital Video Broadcasting (DVB); Specification for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream" (PDF). http://www.etsi.org/deliver/etsi_ts/101100_101199/101154/01.09.01_60/ts_101154v010901p.pdf. Retrieved 2010-05-17. 
  22. ^ "x264 4:2:2 encoding support", Retrieved 2011-09-22.
  23. ^ "x264 4:4:4 encoding support", Retrieved 2011-06-22.
  24. ^ "x264 support for 9 and 10-bit encoding", Retrieved 2011-06-22.
  25. ^ "x264 replace High 4:4:4 profile lossless with High 4:4:4 Predictive", Retrieved 2011-06-22.
  26. ^ "Quick Reference Guide to generation Intel® Core™ Processor Built-in Visuals – Intel® Software Network". software.intel.com. 2010-10-01. http://software.intel.com/en-us/articles/quick-reference-guide-to-intel-integrated-graphics/. Retrieved 2011-01-19. 
  27. ^ "Intel® Quick Sync Video". www.intel.com. 2010-10-01. http://www.intel.com/content/www/us/en/architecture-and-technology/quick-sync-video/quick-sync-video-general.html. Retrieved 2011-01-19. 
  28. ^ "Design-reuse.com". Design-reuse.com. 1990-01-01. http://www.design-reuse.com/sip/?q=H.264+encoder. Retrieved 2010-05-17. 
  29. ^ "Category:DM6467 - Texas Instruments Embedded Processors Wiki". Processors.wiki.ti.com. 2011-07-12. http://processors.wiki.ti.com/index.php/Category:DM6467. Retrieved 2011-07-30. 

Further reading

External links