Video compression

Video compression refers to reducing the quantity of data used to represent video content without excessively reducing picture quality. By reducing the number of bits required to store and transmit digital media, compressed video can be delivered more economically over a channel of lower capacity.

Digital video requires high data rates - the better the picture, the more data is ordinarily needed. This means powerful hardware, and a lot of bandwidth when video is transmitted. However, much of the data in video is not necessary for achieving good perceptual quality, because it can be easily predicted - for example, successive frames in a movie rarely change much from one to the next - and this is what makes data compression work well with video. Video compression can make video files far smaller with little perceptible loss in quality. For example, DVDs use a video coding standard called MPEG-2 that makes the movie 15 to 30 times smaller while still producing a picture quality that is generally considered high for standard-definition video. Without proper use of data compression techniques, either the picture would look much worse or the movie would require much more disk space.
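
The rough scale of that ratio can be checked with a back-of-the-envelope calculation (the frame size, frame rate, and bitrates below are illustrative assumptions, not specifications; actual DVD bitrates vary from disc to disc):

    # Rough compression-ratio estimate for DVD video (illustrative figures only).
    width, height = 720, 480     # standard-definition NTSC frame
    fps = 30                     # approximate frame rate
    bits_per_pixel = 12          # assumes 4:2:0 chroma subsampling, 8-bit samples

    uncompressed_bps = width * height * fps * bits_per_pixel
    mpeg2_bps = 5_000_000        # an assumed typical average DVD bitrate

    print(f"uncompressed: {uncompressed_bps / 1e6:.0f} Mbit/s")   # ~124 Mbit/s
    print(f"MPEG-2:       {mpeg2_bps / 1e6:.0f} Mbit/s")          # 5 Mbit/s
    print(f"ratio:        {uncompressed_bps / mpeg2_bps:.0f}:1")  # ~25:1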

Theory

Video is essentially a three-dimensional array of color pixels. Two dimensions serve as the spatial (horizontal and vertical) directions of the moving pictures, and the third dimension represents the time domain. A frame is the set of all pixels that correspond to a single point in time; in effect, a frame is the same as a still picture. (Frames are sometimes made up of fields; see interlace.)
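
The array view can be made concrete with a short sketch (a minimal illustration using NumPy; the dimensions are arbitrary, and color channels are omitted for simplicity):

    import numpy as np

    # A video clip as a three-dimensional array: two spatial axes plus time.
    frames, height, width = 90, 480, 720          # e.g., 3 seconds at 30 fps
    video = np.zeros((frames, height, width), dtype=np.uint8)

    frame_42 = video[42]          # one frame: a still picture (height x width)
    pixel = video[42, 100, 200]   # a single pixel at one point in time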

Video data contains spatial and temporal redundancy. Similarities can thus be encoded by merely registering differences within a frame (spatial) and/or between frames (temporal). Spatial encoding takes advantage of the fact that the human eye is less able to distinguish small differences in color than changes in brightness, so very similar areas of color can be "averaged out" in much the same way as in JPEG images (JPEG image compression FAQ, part 1/2). With temporal compression, only the changes from one frame to the next are encoded, since a large number of pixels are often the same across a series of frames (About video compression).
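
The temporal idea can be sketched as follows (a toy example assuming 8-bit grayscale frames stored as NumPy arrays; real codecs encode motion-compensated block differences rather than lists of changed pixels):

    import numpy as np

    def temporal_delta(prev_frame, next_frame):
        """Record only the pixels that changed between two frames."""
        changed = prev_frame != next_frame
        positions = np.argwhere(changed)   # coordinates of changed pixels
        values = next_frame[changed]       # their new values
        return positions, values

    def apply_delta(prev_frame, positions, values):
        """Rebuild the next frame from the previous frame plus the delta."""
        frame = prev_frame.copy()
        frame[tuple(positions.T)] = values
        return frame

If few pixels change between frames, the delta is far smaller than the frame itself.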

Lossless compression

Some forms of data compression are lossless. This means that when the data is decompressed, the result is a bit-for-bit perfect match with the original. While lossless compression of video is possible, it is rarely used. This is because any lossless compression system will sometimes produce a file (or portions of one) that is as large as, and has the same data rate as, the uncompressed original. As a result, all hardware in a lossless system would have to be fast enough to handle uncompressed video as well, which eliminates much of the benefit of compressing the data in the first place. For example, digital videotape cannot easily vary its data rate, so handling short bursts of maximum-data-rate video would be more complicated than simply running at the maximum rate all the time.
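
Both the bit-for-bit property and the reason lossless coding cannot guarantee a lower data rate can be demonstrated with a general-purpose lossless compressor (zlib is used here only as a stand-in for a lossless video coder):

    import os
    import zlib

    # Lossless round trip: the output matches the input bit for bit.
    frame = bytes(720 * 480)                 # a flat black frame: highly redundant
    packed = zlib.compress(frame)
    assert zlib.decompress(packed) == frame  # perfect reconstruction
    print(len(frame), "->", len(packed))     # shrinks dramatically

    # Noise-like data is already high-entropy and barely compresses at all,
    # which is why a lossless system must still budget for the full data rate.
    noise = os.urandom(720 * 480)
    print(len(noise), "->", len(zlib.compress(noise)))  # roughly the same size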

Intraframe vs interframe compression

One of the most powerful techniques for compressing video is interframe compression. Interframe compression uses one or more earlier or later frames in a sequence to compress the current frame, while intraframe compression uses only the current frame, which is effectively image compression.

One method is to expand the discrete cosine transform from two to three dimensions.[citation needed] The most commonly used method works by comparing each frame in the video with the previous one. If the frame contains areas where nothing has moved, the system simply issues a short command that copies that part of the previous frame, bit-for-bit, into the next one. If sections of the frame move in a simple manner, the compressor emits a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy -- a longer command, but still much shorter than intraframe compression. Interframe compression works well for programs that will simply be played back by the viewer, but can cause problems if the video sequence needs to be edited.
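
The "shift the copy" idea is the basis of block-matching motion compensation. A minimal sketch follows (an exhaustive search over a small window, assuming grayscale NumPy frames; practical encoders use much faster search strategies and also encode the residual difference between the block and its match):

    import numpy as np

    def find_motion_vector(prev_frame, block, top, left, search=4):
        """Find where a block of the current frame best matches the
        previous frame, searching a small window around its position."""
        h, w = block.shape
        best, best_err = (0, 0), float("inf")
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = top + dy, left + dx
                if 0 <= y <= prev_frame.shape[0] - h and 0 <= x <= prev_frame.shape[1] - w:
                    candidate = prev_frame[y:y + h, x:x + w]
                    err = np.sum((candidate.astype(int) - block.astype(int)) ** 2)
                    if err < best_err:
                        best, best_err = (dy, dx), err
        return best

A motion vector of (0, 0) corresponds to the short "copy this area unchanged" command described above; nonzero vectors correspond to the slightly longer shifted-copy commands.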

Since interframe compression copies data from one frame to another, if the original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. Some video formats, such as DV, compress each frame independently using intraframe compression. Making 'cuts' in intraframe-compressed video is almost as easy as editing uncompressed video -- one finds the beginning and end of each frame, copies each frame one wants to keep bit-for-bit, and discards the rest. Another difference between intraframe and interframe compression is that with intraframe systems, each frame uses a similar amount of data. In most interframe systems, certain frames (such as "I frames" in MPEG-2) are not allowed to copy data from other frames, and so require much more data than the other frames nearby.
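
The editing constraint can be illustrated with a toy frame sequence (assuming a simplified MPEG-style pattern of I and P frames only; B frames, which also reference later frames, complicate this further):

    # I frames stand alone; P frames depend on the frame before them.
    gop = ["I", "P", "P", "P", "I", "P", "P", "P"]

    def safe_cut_points(frames):
        """Indices where the stream can be cut without breaking decoding:
        the first frame after the cut must not reference anything earlier."""
        return [i for i, frame in enumerate(frames) if frame == "I"]

    print(safe_cut_points(gop))  # [0, 4] -- cuts are only safe at I frames

In an intraframe format, every frame is effectively an I frame, so every frame boundary is a safe cut point.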

It is possible to build a computer-based video editor that spots the problems caused when I frames are edited out while other frames still need them. This has allowed newer formats like HDV to be used for editing. However, this process demands considerably more computing power than editing intraframe-compressed video of the same picture quality.

See Editing HDV.

Current forms

Today, nearly all video compression methods in common use (e.g., those in standards approved by the ITU-T or ISO) apply a discrete cosine transform (DCT) for spatial redundancy reduction. Other methods, such as fractal compression, matching pursuit, and the discrete wavelet transform (DWT), have been the subject of some research but are typically not used in practical products (except for the use of wavelet coding as still-image coders without motion compensation). Interest in fractal compression appears to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.
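
A minimal sketch of the separable two-dimensional DCT (using SciPy on an 8x8 block; the block contents are an arbitrary example):

    import numpy as np
    from scipy.fftpack import dct

    def dct2(block):
        """2-D DCT of a block, applied separably along rows and columns."""
        return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

    # A smooth block concentrates nearly all of its energy in a few
    # low-frequency coefficients -- the spatial redundancy the transform
    # exposes for the quantizer and entropy coder to exploit.
    block = np.tile(np.linspace(0, 255, 8), (8, 1))  # smooth horizontal ramp
    print(np.round(dct2(block)).astype(int))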

The use of most video compression techniques (e.g., DCT- or DWT-based techniques) involves quantization. The quantization can be either scalar quantization or vector quantization; however, nearly all practical designs use scalar quantization because of its greater simplicity.
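
A minimal scalar-quantization sketch (the step size and coefficient values below are arbitrary; real coders derive per-frequency step sizes from a quantization matrix):

    import numpy as np

    def quantize(coeffs, step):
        """Scalar quantization: round each coefficient independently to
        the nearest multiple of the step size (this is the lossy stage)."""
        return np.round(coeffs / step).astype(int)

    def dequantize(levels, step):
        return levels * step

    coeffs = np.array([312.0, -47.3, 8.9, -2.1, 0.6])
    levels = quantize(coeffs, step=16)  # small integers, cheap to entropy-code
    print(levels)                       # [20 -3  1  0  0]
    print(dequantize(levels, step=16))  # a coarse approximation of the input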

In broadcast engineering, digital television (DVB, ATSC, and ISDB) is made practical by video compression. TV stations can broadcast not only HDTV but also multiple virtual channels on the same physical channel. Compression also conserves precious bandwidth on the radio spectrum. Nearly all digital video broadcast today uses the MPEG-2 standard video compression format, although H.264/MPEG-4 AVC and VC-1 are emerging contenders in that domain.
