Video compression


Video compression refers to reducing the quantity of data used to represent video images, and combines spatial image compression with temporal motion compensation. This article deals with its applications: compressed video can effectively reduce the bandwidth required to transmit digital video via terrestrial broadcast, via cable, or via satellite services.

Video quality

Most video compression is lossy: it operates on the premise that much of the data present before compression is not necessary for achieving good perceptual quality. For example, DVDs use a video coding standard called MPEG-2 that can compress roughly two hours of video data by 15 to 30 times, while still producing a picture quality that is generally considered high for standard-definition video. Video compression, like data compression, is a tradeoff between disk space, video quality, and the cost of the hardware required to decompress the video in a reasonable time. However, if the video is overcompressed in a lossy manner, visible (and sometimes distracting) artifacts can appear.
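
As a rough sanity check on the 15-to-30-times figure, the short Python sketch below compares the bitrate of uncompressed NTSC standard-definition video against a typical average DVD bitrate. The 4:2:0 pixel format and the 5 Mbit/s figure are illustrative assumptions, not values taken from the DVD specification.

    # Back-of-the-envelope check of the 15-30x ratio quoted above, assuming
    # NTSC standard definition (720x480 at ~30 frames/s) stored with 4:2:0
    # chroma subsampling (an average of 12 bits per pixel).
    width, height, fps = 720, 480, 30
    bits_per_pixel = 12                 # 8 bits luma + 4 bits shared chroma
    raw_bps = width * height * fps * bits_per_pixel
    dvd_bps = 5_000_000                 # an assumed typical DVD average bitrate
    print(f"raw video:  {raw_bps / 1e6:.0f} Mbit/s")
    print(f"DVD MPEG-2: {dvd_bps / 1e6:.0f} Mbit/s")
    print(f"ratio:      {raw_bps / dvd_bps:.0f}x")   # about 25x, inside 15-30x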

Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These blocks of pixels are compared from one frame to the next, and the video compression codec (encode/decode scheme) sends only the differences within those blocks. This works extremely well if the video has no motion: a still frame of text, for example, can be repeated with very little transmitted data. In areas of video with more motion, more pixels change from one frame to the next, so the compression scheme must send more data to keep up with the larger number of changing pixels. If the content includes an explosion, flames, a flock of thousands of birds, or any other image with a great deal of high-frequency detail, the quality will decrease, or the bitrate must be increased to render this added information with the same level of detail.
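
The sketch below illustrates this block-differencing idea (sometimes called conditional replenishment) in Python with NumPy. The 16x16 block size matches common macroblock dimensions, but the frame contents and the change threshold are invented for illustration.

    import numpy as np

    BLOCK = 16   # macroblock edge length, as in several common codecs

    def changed_blocks(prev, curr, threshold=2.0):
        """Yield (row, col, pixels) for each block whose mean absolute
        difference from the previous frame exceeds the threshold."""
        h, w = curr.shape
        for y in range(0, h - h % BLOCK, BLOCK):
            for x in range(0, w - w % BLOCK, BLOCK):
                a = prev[y:y+BLOCK, x:x+BLOCK].astype(np.int16)
                b = curr[y:y+BLOCK, x:x+BLOCK].astype(np.int16)
                if np.abs(b - a).mean() > threshold:
                    yield y, x, curr[y:y+BLOCK, x:x+BLOCK]

    # Two mostly identical 64x64 grayscale frames: one block "moves".
    prev = np.zeros((64, 64), dtype=np.uint8)
    curr = prev.copy()
    curr[16:32, 32:48] = 200                  # change confined to one block
    print(f"blocks sent: {len(list(changed_blocks(prev, curr)))} of 16")

For the still-text case described above, no block crosses the threshold and essentially nothing is sent; for a frame full of motion, nearly every block would be.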

The programming provider controls the amount of video compression applied to its programming before it is sent to the distribution system. DVDs, Blu-ray Discs, and HD DVDs have video compression applied during mastering, though Blu-ray and HD DVD have enough disc capacity that the compression applied in these formats is light compared with, for example, most video streamed on the internet or recorded on a cellphone. Video stored on hard drives or various optical disc formats is therefore often of lower image quality, although not in all cases. High-bitrate video codecs with little or no compression exist for video post-production work, but they create very large files and are thus almost never used for the distribution of finished videos. Once excessive lossy compression has compromised image quality, it is impossible to restore the image to its original quality.

Theory

Video is essentially a three-dimensional array of color pixels. Two dimensions serve as the spatial (horizontal and vertical) directions of the moving pictures, and one dimension represents the time domain. A data frame is the set of all pixels that correspond to a single moment in time; in essence, a frame is the same as a still picture. (Frames are sometimes made up of fields; see interlace.)
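
In NumPy terms, this can be pictured as below. The fourth axis here holds the color components of each pixel, which is one common way of laying out the array; the sizes are arbitrary.

    import numpy as np

    # 30 RGB frames of 64x48 pixels as a (time, height, width, channel) array.
    frames = np.zeros((30, 48, 64, 3), dtype=np.uint8)
    print(frames.shape)             # (30, 48, 64, 3)
    print(frames[0].shape)          # a single frame, i.e. a still picture
    print(frames[:, 10, 20].shape)  # one pixel position traced through time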

Video data contains spatial and temporal redundancy. Similarities can thus be encoded by merely registering differences within a frame (spatial) and between frames (temporal). Spatial encoding exploits the fact that the human eye is less able to distinguish small differences in color than changes in brightness, so very similar areas of color can be "averaged out" in much the same way as in JPEG images (JPEG image compression FAQ, part 1/2). With temporal compression, only the changes from one frame to the next are encoded, since a large number of pixels are often identical across a series of frames.
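
A minimal sketch of both ideas, assuming NumPy arrays of samples: chroma values are averaged over 2x2 neighborhoods (the intuition behind 4:2:0 subsampling), and successive frames are stored as differences from the previous one. Real codecs follow both steps with quantization and entropy coding, which this sketch omits.

    import numpy as np

    def subsample_chroma(chroma):
        """Spatial redundancy: average each 2x2 neighbourhood of chroma
        samples, keeping a quarter of the original values."""
        h, w = chroma.shape
        return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

    def temporal_deltas(frames):
        """Temporal redundancy: keep the first frame plus frame-to-frame
        differences, which are mostly zero for static content."""
        frames = frames.astype(np.int16)
        return frames[0], np.diff(frames, axis=0)

    frames = np.zeros((10, 48, 64), dtype=np.uint8)   # a static 10-frame clip
    first, deltas = temporal_deltas(frames)
    print(subsample_chroma(np.full((48, 64), 128.0)).shape)  # (24, 32)
    print(np.count_nonzero(deltas), "changed samples across 9 deltas")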

Lossless compression

Some forms of data compression are lossless. This means that when the data is decompressed, the result is a bit-for-bit perfect match with the original. While lossless compression of video is possible, it is rarely used, as lossy compression results in far higher compression ratios at an acceptable level of quality, which is the entire point of video compression.
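
The round trip below demonstrates the bit-for-bit property using zlib, a general-purpose lossless compressor standing in here for a true lossless video codec; the frame contents are invented for illustration.

    import zlib
    import numpy as np

    # A mostly flat frame with one bright region, so it compresses well.
    frame = np.zeros((48, 64), dtype=np.uint8)
    frame[10:20, 10:30] = 128

    packed = zlib.compress(frame.tobytes())
    restored = np.frombuffer(zlib.decompress(packed),
                             dtype=np.uint8).reshape(frame.shape)

    assert np.array_equal(frame, restored)   # bit-for-bit identical
    print(f"{frame.nbytes} raw bytes -> {len(packed)} compressed bytes")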

Intraframe vs interframe compression

One of the most powerful techniques for compressing video is interframe compression. Interframe compression uses one or more earlier or later frames in a sequence to compress the current frame, while intraframe compression uses only the current frame, which is effectively image compression.

The most commonly used method works by comparing each frame in the video with the previous one. If the frame contains areas where nothing has moved, the system simply issues a short command that copies that part of the previous frame, bit-for-bit, into the next one. If sections of the frame move in a simple manner, the compressor emits a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy: a longer command, but still much shorter than the data produced by intraframe compression. Interframe compression works well for programs that will simply be played back by the viewer, but it can cause problems if the video sequence needs to be edited.
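
The "shift the copy" command is usually found by block matching: search a window of the previous frame for the best match of a block, and transmit only the resulting motion vector. The following Python sketch uses an exhaustive search with a sum-of-absolute-differences cost; the search radius and frame contents are illustrative choices rather than parameters of any specific codec.

    import numpy as np

    def best_motion_vector(prev, curr, y, x, block=16, radius=4):
        """Exhaustively search a (2*radius+1)^2 window of the previous frame
        for the block that best matches curr[y:y+block, x:x+block]."""
        target = curr[y:y+block, x:x+block].astype(np.int16)
        best, best_cost = (0, 0), np.inf
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy <= prev.shape[0] - block and 0 <= xx <= prev.shape[1] - block:
                    cand = prev[yy:yy+block, xx:xx+block].astype(np.int16)
                    cost = np.abs(target - cand).sum()   # sum of absolute differences
                    if cost < best_cost:
                        best, best_cost = (dy, dx), cost
        return best

    prev = np.zeros((64, 64), dtype=np.uint8)
    prev[20:36, 20:36] = 200                         # a bright square...
    curr = np.roll(prev, shift=3, axis=1)            # ...moved 3 pixels right
    print(best_motion_vector(prev, curr, 20, 23))    # (0, -3): copy from 3 px left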

Since interframe compression copies data from one frame to another, if the original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. Some video formats, such as DV, compress each frame independently using intraframe compression. Making 'cuts' in intraframe-compressed video is almost as easy as editing uncompressed video: one finds the beginning and end of each frame, copies bit-for-bit each frame to keep, and discards the frames one doesn't want. Another difference between intraframe and interframe compression is that with intraframe systems, each frame uses a similar amount of data. In most interframe systems, certain frames (such as "I frames" in MPEG-2) are not allowed to copy data from other frames, and so require much more data than the other frames nearby.
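
The group-of-pictures (GOP) structure makes this concrete. In the hypothetical stream below, tagged with MPEG-style frame types, only I frames are self-contained, so an editor that wants to cut mid-GOP must back up to the previous I frame or re-encode from it. The 12-frame pattern is a common textbook example, not a requirement of any standard.

    # One letter per frame: I frames stand alone, P and B frames borrow data.
    gop = list("IBBPBBPBBPBB" * 2)   # two 12-frame GOPs (illustrative pattern)

    def last_i_frame_at_or_before(frames, cut):
        """Index of the nearest self-contained frame at or before the cut."""
        return max(i for i in range(cut + 1) if frames[i] == "I")

    print(last_i_frame_at_or_before(gop, 7))    # 0: frames 1-7 need frame 0
    print(last_i_frame_at_or_before(gop, 14))   # 12: the second GOP's I frame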

It is possible to build a computer-based video editor that detects the problems caused when I frames are edited out while other frames depend on them. This has allowed newer formats like HDV to be used for editing. However, the process demands much more computing power than editing intraframe-compressed video of the same picture quality.

See Editing HDV.

Current forms

Today, nearly all video compression methods in common use (e.g., those in standards approved by the ITU-T or ISO) apply a discrete cosine transform (DCT) for spatial redundancy reduction. Other methods, such as fractal compression, matching pursuit, and the use of a discrete wavelet transform (DWT), have been the subject of some research, but are typically not used in practical products (except for the use of wavelet coding as still-image coders without motion compensation). Interest in fractal compression seems to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.[citation needed]
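
The appeal of the DCT is that smooth image blocks concentrate their energy in a few low-frequency coefficients, which can then be quantized aggressively or dropped. The sketch below applies a 2-D DCT to a smooth 8x8 gradient block using SciPy and discards near-zero coefficients; the block contents and the threshold are arbitrary illustrative choices.

    import numpy as np
    from scipy.fft import dctn, idctn

    block = np.add.outer(np.arange(8.0), np.arange(8.0))  # a smooth ramp
    coeffs = dctn(block, norm="ortho")                    # forward 2-D DCT
    kept = np.where(np.abs(coeffs) > 1.0, coeffs, 0.0)    # crude thresholding
    restored = idctn(kept, norm="ortho")                  # inverse 2-D DCT

    print(np.count_nonzero(kept), "of 64 coefficients kept")
    print(f"max reconstruction error: {np.abs(block - restored).max():.3f}")

Only a handful of coefficients survive the threshold, yet the reconstruction error stays small, which is exactly the property a quantizer exploits.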
