Lossy compression

From Wikipedia, the free encyclopedia

Original image (lossless PNG, 60.1 KiB; 108.5 KiB uncompressed)

Low compression (84% smaller than the lossless PNG, 9.37 KiB)

Medium compression (92% smaller than the lossless PNG, 4.82 KiB)

High compression (98% smaller than the lossless PNG, 1.14 KiB)

A lossy compression method is one where compressing data and then decompressing it yields data that may differ from the original but is close enough to be useful in some way. Lossy compression is most commonly used to compress multimedia data (audio, video, still images), especially in applications such as streaming media and internet telephony. By contrast, lossless compression is required for text and data files, such as bank records and text articles.

Lossy compression formats suffer from generation loss: repeatedly compressing and decompressing a file causes it to progressively lose quality. Lossless compression does not have this problem, because every decompression reproduces the original data exactly.
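
Generation loss can be illustrated with a toy example. The sketch below is not a real codec: the hypothetical lossy_cycle function stands in for one compress/decompress round by smoothing neighbouring samples and rounding to integers, which is an assumption made only to keep the example short. It records how far the signal has drifted from the original after each generation.

```python
def lossy_cycle(samples):
    """One toy compress/decompress cycle: smooth neighbours, then round."""
    n = len(samples)
    out = []
    for i in range(n):
        left = samples[max(i - 1, 0)]        # clamp at the left edge
        right = samples[min(i + 1, n - 1)]   # clamp at the right edge
        out.append(round((left + 2 * samples[i] + right) / 4))
    return out


def max_error(a, b):
    """Largest absolute difference between two equal-length sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


original = [0, 0, 0, 0, 90, 0, 0, 0, 0]
current = original
errors = []  # errors[g] is the drift after g + 1 generations
for generation in range(5):
    current = lossy_cycle(current)
    errors.append(max_error(original, current))

print(errors)
```

In this toy setup the error after the second generation is larger than after the first, which mirrors the progressive loss described above. Real formats lose quality for their own reasons, so the numbers here carry no meaning beyond the illustration.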

The information-theoretic foundations of lossy data compression are provided by rate-distortion theory. Much like the use of probability in optimal coding theory, rate-distortion theory draws heavily on Bayesian estimation and decision theory in order to model perceptual distortion and even aesthetic judgment.


Types

There are two basic lossy compression schemes:

In lossy transform codecs, samples of picture or sound are taken, chopped into small segments, transformed into a new basis space, and quantized. The resulting quantized values are then entropy coded.
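
The transform-codec steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any particular format: the block size of 8, the choice of a discrete cosine transform as the new basis, the uniform quantization step, and the function names are all chosen here for the example. The final entropy-coding stage is omitted.

```python
import math

BLOCK = 8  # samples per segment (assumed, illustrative size)


def _scale(k, n):
    """Normalisation that makes the DCT orthonormal."""
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(block):
    """Orthonormal DCT-II of one block of samples."""
    n = len(block)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct(): rebuild samples from coefficients."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def encode(samples, step):
    """Chop into blocks, transform each block, quantize to integers."""
    blocks = [samples[i:i + BLOCK] for i in range(0, len(samples), BLOCK)]
    return [[round(c / step) for c in dct(b)] for b in blocks]


def decode(quantized, step):
    """Rescale the integers and invert the transform block by block."""
    out = []
    for q in quantized:
        out.extend(idct([v * step for v in q]))
    return out


signal = [100 * math.sin(2 * math.pi * i / 32) for i in range(64)]
quantized = encode(signal, step=10.0)
restored = decode(quantized, step=10.0)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(f"largest sample error: {max_error:.2f}")
```

The loss comes entirely from the rounding in encode: a larger step discards more detail and leaves smaller integers for the entropy coder, while a very small step reproduces the input almost exactly.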

In lossy predictive codecs, previous and/or subsequent decoded data is used to predict the current sound sample or image frame. The error between the predicted data and the real data, together with any extra information needed to reproduce the prediction, is then quantized and coded.
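
A predictive codec can be sketched in its simplest form as follows, assuming the prediction for each sample is just the previously decoded sample and that the prediction error is quantized with a uniform step. The function names and the test data are hypothetical. The encoder predicts from the decoded value rather than the true one, so that it stays in step with the decoder and quantization errors do not accumulate.

```python
def predictive_encode(samples, step):
    """Quantize the error between each sample and its prediction."""
    codes = []
    prediction = 0.0  # the decoder starts from the same state
    for x in samples:
        code = round((x - prediction) / step)  # quantized prediction error
        codes.append(code)
        prediction += code * step  # value the decoder will reconstruct
    return codes


def predictive_decode(codes, step):
    """Rebuild samples by adding each dequantized error to the prediction."""
    out = []
    prediction = 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out


samples = [0, 3, 7, 12, 14, 13, 9, 4]
step = 2.0
codes = predictive_encode(samples, step)
decoded = predictive_decode(codes, step)
worst = max(abs(a - b) for a, b in zip(samples, decoded))
print(codes, decoded, worst)
```

With this arrangement each decoded sample is within half a quantization step of the original, and the codes are small integers whenever the signal changes slowly.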

In some systems the two techniques are combined, with transform codecs being used to compress the error signals generated by the predictive stage.

Lossy versus lossless

The advantage of lossy methods over lossless methods is that in some cases a lossy method can produce a much smaller compressed file than any known lossless method, while still meeting the requirements of the application.

Lossy methods are most often used for compressing sound, images or videos. This is because these types of data are intended for human interpretation, where the mind can easily "fill in the blanks" or see past very minor errors or inconsistencies. The compression ratio (that is, the size of the uncompressed file compared to that of the compressed file) of lossy video codecs is nearly always far superior to that of the audio and still-image equivalents. Audio can often be compressed at 10:1 with imperceptible loss of quality, and video can be compressed immensely (e.g. 300:1) with little visible quality loss. Lossily compressed still images are often compressed to 1/10th of their original size, as with audio, but the quality loss is more noticeable, especially on closer inspection.
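
As a worked example of the arithmetic, the sketch below applies the ratio convention used above (uncompressed size to compressed size, as in 10:1) to the file sizes quoted in the image captions at the top of the article. The helper names are hypothetical and exist only for this illustration.

```python
def compression_ratio(uncompressed_size, compressed_size):
    """Ratio N in 'N:1': how many times smaller the compressed file is."""
    return uncompressed_size / compressed_size


def percent_saved(reference_size, compressed_size):
    """Percentage by which the compressed file is smaller than a reference."""
    return 100 * (1 - compressed_size / reference_size)


UNCOMPRESSED_KIB = 108.5   # raw image size from the caption
LOSSLESS_PNG_KIB = 60.1    # lossless PNG size from the caption
lossy_sizes_kib = {"low": 9.37, "medium": 4.82, "high": 1.14}

for level, size in lossy_sizes_kib.items():
    ratio = compression_ratio(UNCOMPRESSED_KIB, size)
    saved = percent_saved(LOSSLESS_PNG_KIB, size)
    print(f"{level}: {ratio:.1f}:1, {saved:.0f}% smaller than the lossless PNG")
```

Note that the percentages in the captions are measured against the lossless PNG, while the ratios computed here are measured against the uncompressed image, so the two sets of figures are not directly comparable.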

When a user acquires a lossily compressed file (for example, to reduce download time), the retrieved file can be quite different from the original at the bit level while being indistinguishable to the human ear or eye for most practical purposes. Many methods exploit the idiosyncrasies of human physiology, taking into account, for example, that the human eye can see only certain wavelengths of light. The psychoacoustic model describes how sound can be highly compressed without degrading its perceived quality. Flaws caused by lossy compression that are noticeable to the human eye or ear are known as compression artifacts.

Methods

Graphics

Image

Cartesian Perceptual Compression (CPC)
DjVu
Fractal compression
HAM, hardware compression of color information used in Amiga computers
ICER, used by the Mars Rovers; related to JPEG 2000 in its use of wavelets
JPEG
JPEG 2000, JPEG's successor format, which uses wavelets
JBIG2
PGF, Progressive Graphics File (lossless or lossy compression)
Wavelet compression

Video

H.261
H.263
H.264/MPEG-4 AVC
MNG (supports JPEG sprites)
Motion JPEG
MPEG-1 Part 2
MPEG-2 Part 2
MPEG-4 Part 2
Ogg Theora (noted for its lack of patent restrictions)
Sorenson video codec
VC-1

Audio

Music

AAC
ADPCM
ATRAC
Dolby AC-3
MP2
MP3
Musepack
Ogg Vorbis (noted for its lack of patent restrictions)
WMA

Speech

CELP
G.711
G.726
Harmonic and Individual Lines and Noise (HILN)
AMR (used by GSM cell carriers, such as T-Mobile)
Speex (noted for its lack of patent restrictions)

Other data

Researchers have (half-jokingly) performed lossy compression on text, either by using a thesaurus to substitute short words for long ones or by using generative text techniques [1], although these sometimes fall into the related category of lossy data conversion.

See also

Notes

[1] I. H. Witten et al. Semantic and Generative Models for Lossy Text Compression (PDF). The Computer Journal. Retrieved on 2007-10-13.

External links

Lossy audio formats, comparing the speed and compression strength of five lossy audio formats
Lossy PNG image compression (research)
Using lossy GIF/PNG compression for the web (article)
