Dynamic range compression, also called DRC (often seen in DVD player settings) or simply compression, is a process that reduces the dynamic range of an audio signal. Compression is used during sound recording, live sound reinforcement, and broadcasting to control the level of audio. A compressor is the device used to apply compression.
In simple terms, a compressor is an automatic volume control. Loud sounds over a certain threshold are reduced in level while quiet sounds remain untreated. This is known as downward compression; the less common upward compression makes sounds below the threshold louder while leaving the louder passages unchanged. Either way, the dynamic range of the audio signal is reduced. This may be done for aesthetic reasons, to deal with technical limitations of audio equipment, or to improve the audibility of audio in noisy environments.
In a noisy environment, background noise can overpower quiet sounds (such as listening to a car stereo while driving). A comfortable listening level for loud sounds makes the quiet sounds inaudible below the noise; a comfortable listening level for quiet sounds makes the loud sounds too loud. Compression is used in order to make both the soft and loud parts of a sound more tolerable at the same volume setting.
Compression reduces the level of the loud sounds but not the quiet sounds, so the overall level can be raised to a point where the quiet sounds are more audible without the loud sounds being too loud. Contrast this with the complementary process of expansion: an expander performs almost the exact opposite function of a compressor, i.e., it increases the dynamic range of the audio signal.[1]
A compressor reduces the gain (level) of an audio signal if its amplitude exceeds a certain threshold. The amount of gain reduction is determined by a ratio. For example, with a ratio of 4:1, when the (time-averaged) input level is 4 dB over the threshold, the output signal level will be 1 dB over the threshold; the gain (level) has been reduced by 3 dB. When the input level is 8 dB above the threshold, the output level will be 2 dB above the threshold, a 6 dB gain reduction.
A more specific example for a 4:1 ratio:
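The arithmetic can be sketched in code. This is a minimal illustration rather than a model of any particular unit: it assumes a hard knee and steady-state levels already expressed in dB, and the function name `compressed_level_db` and the -20 dB threshold are hypothetical choices made for the example.

```python
def compressed_level_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Steady-state output level (dB) of a hard-knee downward compressor."""
    if input_db <= threshold_db:
        # At or below the threshold the signal is left untreated.
        return input_db
    # Above the threshold, the overshoot is divided by the ratio.
    return threshold_db + (input_db - threshold_db) / ratio


threshold_db = -20.0  # illustrative value, not taken from the text
for overshoot_db in (4.0, 8.0):
    out_db = compressed_level_db(threshold_db + overshoot_db, threshold_db, ratio=4.0)
    out_over = out_db - threshold_db
    print(f"{overshoot_db:.0f} dB over in -> {out_over:.0f} dB over out, "
          f"{overshoot_db - out_over:.0f} dB gain reduction")
```

The loop reproduces the two cases described above: 4 dB and 8 dB of overshoot at a 4:1 ratio.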
The signal entering a compressor is split, with one copy sent to a variable-gain amplifier and the other to a path called the side-chain, where a control circuit calculates the required amount of gain reduction. The control-circuit outputs the requested gain-reduction amount to the amplifier. This type of design is known as feed-forward type and is used today on most compressors. Early compressor designs were based on a feedback type layout where the signal feeding the control circuit was taken after the amplifier.
The variable-gain amplifier is the component that reduces the gain of the signal. A number of technologies are used for this purpose, each with different advantages and disadvantages. Vacuum tubes are used in a configuration called 'variable-µ', where the grid-to-cathode voltage changes to alter the gain.[2] Also used is a voltage-controlled amplifier, which has its gain reduced as the power of the input signal increases. Optical compressors use a light-sensitive resistor (LDR) and a small lamp (LED or electroluminescent panel) to create changes in signal gain. This technique is believed by some to add smoother characteristics to the signal, because the response times of the light and the resistor soften the attack and release. Other technologies used include field-effect transistors and a diode bridge.[3]
When working with digital audio, digital signal processing techniques are commonly used to implement compression via digital audio editors, or dedicated workstations. Often the algorithms used emulate the above analog technologies.
Threshold is the level above which the signal is reduced. It is commonly set in dB, where a lower threshold (e.g. -60 dB) means a larger portion of the signal will be treated (compared to a higher threshold of -5 dB).
The ratio determines the input/output ratio for signals above the threshold. For example, a 4:1 ratio means that a signal overshooting the threshold by 4 dB will leave the compressor 1 dB above the threshold. The highest ratio of ∞:1 is commonly achieved using a ratio of 60:1, and effectively denotes that any signal above the threshold will be brought down to the threshold level (unless some attack is in force).
A compressor might provide a degree of control over how quickly it acts. The 'attack phase' is the period when the compressor is increasing gain reduction to reach the level that is determined by the ratio. The 'release phase' is the period when the compressor is decreasing gain reduction to the level determined by the ratio, or to zero once the level has fallen below the threshold. The length of each period is determined by the rate of change and the required change in gain reduction. For more intuitive operation, a compressor's attack and release controls are labelled in units of time (often milliseconds). This is the amount of time it will take for the gain to change by a set number of dB, decided by the manufacturer and very often 10 dB. For example, if the compressor's time constants are referenced to 10 dB and the attack time is set to 1 ms, it will take 1 ms for the gain reduction to rise from 0 dB to 10 dB, and 2 ms to rise from 0 dB to 20 dB.[4]
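The relationship between a time setting and the actual duration of an attack or release phase can be sketched as follows. This assumes a constant rate of gain change in dB per millisecond, which is what the 10 dB and 20 dB example implies; real units may follow other curves. The names `phase_duration_ms` and `step_gain_reduction` are hypothetical.

```python
def phase_duration_ms(gain_change_db: float, time_setting_ms: float,
                      reference_db: float = 10.0) -> float:
    """Time needed for the gain reduction to change by gain_change_db."""
    return time_setting_ms * abs(gain_change_db) / reference_db


def step_gain_reduction(current_db: float, target_db: float, dt_ms: float,
                        attack_ms: float, release_ms: float,
                        reference_db: float = 10.0) -> float:
    """Move the gain reduction toward its target over one time step of dt_ms."""
    if target_db > current_db:
        # Attack phase: gain reduction is increasing.
        max_step = reference_db * dt_ms / attack_ms
        return min(target_db, current_db + max_step)
    # Release phase: gain reduction is decreasing.
    max_step = reference_db * dt_ms / release_ms
    return max(target_db, current_db - max_step)


print(phase_duration_ms(10.0, time_setting_ms=1.0))
print(phase_duration_ms(20.0, time_setting_ms=1.0))
```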
In many compressors the attack and release times are adjustable by the user. Some compressors, however, have the attack and release times determined by the circuit design and these cannot be adjusted by the user. Sometimes the attack and release times are 'automatic' or 'program dependent', meaning that the times change depending on the input signal. Because the loudness pattern of the source material is modified by the compressor it may change the character of the signal in subtle to quite noticeable ways depending on the settings used.
Another control a compressor might offer is hard/soft knee. This controls whether the bend in the response curve is a sharp angle or has a rounded edge. A soft knee gradually increases the compression ratio as the level increases, eventually reaching the compression ratio set by the user. A soft knee reduces the audible change from uncompressed to compressed, especially for higher ratios where the changeover is more noticeable.[5]
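One way to picture a soft knee is as a smooth blend between the uncompressed line and the compressed line. The sketch below uses a quadratic interpolation across a knee of a given width centred on the threshold. This is one commonly used formulation and is an assumption here, not something the text specifies; the function name and the `knee_width_db` parameter are hypothetical.

```python
def soft_knee_level_db(input_db: float, threshold_db: float, ratio: float,
                       knee_width_db: float) -> float:
    """Steady-state output level (dB) with a quadratic soft knee."""
    over = input_db - threshold_db
    half_knee = knee_width_db / 2.0
    if over <= -half_knee:
        # Well below the threshold: no compression.
        return input_db
    if over >= half_knee:
        # Well above the threshold: the full ratio applies.
        return threshold_db + over / ratio
    # Inside the knee: the effective ratio rises gradually toward `ratio`.
    return input_db + (1.0 / ratio - 1.0) * (over + half_knee) ** 2 / (2.0 * knee_width_db)


for level_db in (-26.0, -23.0, -20.0, -17.0, -14.0):
    print(level_db, round(soft_knee_level_db(level_db, -20.0, 4.0, 6.0), 4))
```

Setting `knee_width_db` to 0 reduces this to the hard-knee case, since the middle branch is then never reached.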
A peak sensing compressor responds to the instantaneous level of the input signal. While providing tighter peak control, peak sensing might yield very quick changes in gain reduction, more evident compression or sometimes even distortion. Some compressors will apply an averaging function (commonly RMS) on the input signal before its level is compared to the threshold. This allows a more relaxed compression that also more closely relates to our perception of loudness.
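The difference between the two detection schemes can be sketched as below. This is a simplified illustration: the peak detector is taken to be the absolute value of each sample and the RMS detector a sliding-window average, with the window length as an assumed parameter. The function names are hypothetical.

```python
import math


def peak_detector(samples):
    """Instantaneous level: the absolute value of each sample."""
    return [abs(s) for s in samples]


def rms_detector(samples, window):
    """Root-mean-square level over the most recent `window` samples."""
    levels = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1): i + 1]
        levels.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    return levels


# A single-sample spike in otherwise silent audio.
signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(peak_detector(signal))
print(rms_detector(signal, window=4))
```

The peak detector jumps to full level for one sample and immediately falls back, whereas the RMS detector reports a lower level spread over the length of the window, which is why it tends to produce more relaxed gain changes.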
A compressor in stereo linking mode applies the same amount of gain reduction to both the left and right channels. This is done to prevent image shifting that could occur if each channel is compressed individually and content on one channel is louder than that on the other (an example would be a tom hit in a drum-mix for a tom panned extreme left).
Stereo linking can be achieved in two ways. Either the compressor sums the left and right channels to mono at the input, in which case only the left channel controls are functional; or the compressor still calculates the required amount of gain reduction independently for each channel and then applies the highest amount of gain reduction to both. In the latter case it can still make sense to dial in different settings on the left and right channels, as one might wish to have less compression for left-side events.[6]
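The second linking method can be sketched as follows, assuming a hard-knee gain computer and per-channel levels already measured in dB. The function names and the example levels are hypothetical.

```python
def gain_reduction_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Gain reduction (positive dB) requested for one channel."""
    over = level_db - threshold_db
    return over - over / ratio if over > 0 else 0.0


def linked_gain_reduction_db(left_db: float, right_db: float,
                             threshold_db: float, ratio: float) -> float:
    """Apply the larger of the two per-channel gain reductions to both channels."""
    return max(gain_reduction_db(left_db, threshold_db, ratio),
               gain_reduction_db(right_db, threshold_db, ratio))


# A loud event on the left only (for example a tom panned hard left).
left_db, right_db = -8.0, -30.0
shared = linked_gain_reduction_db(left_db, right_db, threshold_db=-20.0, ratio=4.0)
print("left out:", left_db - shared, "right out:", right_db - shared)
```

Because both channels receive the same reduction, the level difference between left and right is preserved and the stereo image does not shift.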
Because the compressor is reducing the gain (or level) of the signal, the ability to add a fixed amount of make-up gain at the output is provided so that an optimum level can be used.
The look-ahead function is designed to overcome the problem of being forced to compromise between slow attack rates that produce smooth-sounding gain changes, and fast attack rates capable of catching transients. Look-ahead is a misnomer in that the future is not actually observed. Instead, the input signal is split, and one side is delayed. The non-delayed signal is used to drive the compression of the delayed signal, which then appears at the output. This way a smooth-sounding slower attack rate can be used to catch transients. The cost of this solution is that the signal is delayed.
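The split-and-delay arrangement can be sketched as below. To keep the example short, the control path uses the largest absolute sample seen within the look-ahead window and a brick-wall style gain, standing in for the smoother attack curve a real unit would use. That simplification, the function name `lookahead_limit`, and the `ceiling` parameter are assumptions.

```python
def lookahead_limit(samples, ceiling, lookahead):
    """Limit `samples` to `ceiling`, delaying the audio path by `lookahead` samples."""
    out = []
    for n in range(len(samples) + lookahead):
        # Control path: the undelayed input, observed up to index n.
        window = samples[max(0, n - lookahead): n + 1]
        peak = max((abs(s) for s in window), default=0.0)
        gain = min(1.0, ceiling / peak) if peak > 0 else 1.0
        # Audio path: the same input, delayed by `lookahead` samples.
        idx = n - lookahead
        delayed = samples[idx] if 0 <= idx < len(samples) else 0.0
        out.append(delayed * gain)
    return out


audio = [0.1, 0.1, 1.0, 0.1]
print(lookahead_limit(audio, ceiling=0.5, lookahead=2))
```

In the output, the gain has already dropped by the time the transient reaches the output, and the result is two samples longer than the input, which is the delay cost mentioned above.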
An audio engineer might use a compressor subtly in order to reduce the dynamic range of source material in order to allow it to be recorded optimally on a medium with a more limited dynamic range than the source signal, or they might use a compressor in order to deliberately change the character of an instrument being processed.
Engineers wishing to achieve dynamic range reduction with few obvious effects might choose a relatively high threshold and low compression ratio so that the source material is being compressed very slightly most of the time. To deliberately soften the attack of a snare drum, they might choose a fast attack time and a moderately fast release time combined with a higher threshold. To accentuate the attack of the snare, they might choose a slower attack time to avoid affecting the initial transient. It is easier to successfully apply these controls if the user has a basic knowledge of musical instrument acoustics.
Compression can also be used to lift the soft passages of a selection, pulling the sound toward a compressed "middle": loud sounds are pulled back and soft passages are boosted.
Compression and limiting are no different in process, just in degree and in the perceived effect. A limiter is a compressor with a higher ratio, and generally a fast attack time. Most engineers consider a ratio of 10:1 or more as limiting, although there are no set rules.[7] Engineers sometimes refer to soft and hard limiting which are differences of degree. The "harder" a limiter, the higher its ratio and the faster its attack and release times.
Brick wall limiting has a very high ratio and a very fast attack time. Ideally, this ensures that an audio signal never exceeds the amplitude of the threshold. Ratios of 20:1 all the way up to ∞:1 are considered to be 'brick wall'.[7] The sonic results of more than momentary and infrequent hard/brick-wall limiting are usually characterized as harsh and unpleasant; thus it is more appropriate as a safety device in live sound and broadcast applications than as a sound-sculpting tool.
Some modern consumer electronics devices incorporate limiters. Sony uses the Automatic Volume Limiter System (AVLS) on some audio products and the PlayStation Portable.
Side-chaining uses the signal level of another input or an equalized version of the original input to control the compression level of the original signal. For sidechains that key off of external inputs, when the external signal is stronger, the compressor acts more strongly to reduce output gain. This is used by disc jockeys to lower the music volume automatically when speaking; in this example, the DJ's microphone signal is converted to line level signal and routed to a stereo compressor's sidechain input. The music level is routed through the stereo compressor so that whenever the DJ speaks, the compressor reduces the volume of the music, a process called ducking. The sidechain of a compressor that has EQ controls can be used to reduce the volume of signals that have a strong spectral content within the frequency range of interest. Such a compressor can be used as a de-esser, reducing the level of annoying vocal sibilance in the range of 6-9 kHz. A frequency-specific compressor can be assembled from a standard compressor and an equalizer by feeding a 6-9 kHz-boosted copy of the original signal into the side-chain input of the compressor. A de-esser helps reduce high frequencies that tend to overdrive preemphasized media (such as phonograph records and FM radio). Another use of the side-chain in music production serves to maintain a loud bass track, while still keeping the bass out of the way of the drum when the drum hits.
A stereo compressor without a sidechain can be used as a mono compressor with a sidechain. The key or sidechain signal is sent to the first (main) input of the stereo compressor while the signal that is to be compressed is routed into and out of the second channel of the compressor.
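Ducking with an external key can be sketched as follows. This is a simplified, sample-by-sample illustration with no attack or release smoothing; it assumes a hard-knee gain computer, and the function name `sidechain_compress` and the example settings are hypothetical.

```python
import math


def sidechain_compress(audio, key, threshold_db, ratio):
    """Reduce the gain of `audio` according to the level of the `key` signal."""
    out = []
    for a, k in zip(audio, key):
        # The level is measured on the key (side-chain) signal, not on the audio.
        key_db = 20.0 * math.log10(max(abs(k), 1e-9))
        over = key_db - threshold_db
        reduction_db = over - over / ratio if over > 0 else 0.0
        # The resulting gain is applied to the audio path.
        out.append(a * 10.0 ** (-reduction_db / 20.0))
    return out


music = [0.5, 0.5, 0.5, 0.5]
voice = [0.0, 0.0, 1.0, 1.0]  # the DJ starts speaking halfway through
print(sidechain_compress(music, voice, threshold_db=-20.0, ratio=4.0))
```

The music passes unchanged while the key is silent and is turned down once the key exceeds the threshold.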
One technique is to insert the compressor in a parallel signal path. This is known as parallel compression and can give a measure of dynamic control without significant audible side effects, if the ratio is relatively low and the compressor's sound is relatively neutral. On the other hand, a high compression ratio with significant audible artifacts can be chosen in one of the two parallel signal paths — this is used by some concert mixers as an artistic effect called New York compression. Combining a linear signal with a compressor and then reducing the output gain of the compression chain results in low-level detail enhancement without any peak reduction (since the compressor will significantly add to the combined gain at low levels only). This will often be beneficial when compressing transient content, since high-level dynamic liveliness is still maintained, despite the overall dynamic range reduction.
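The low-level enhancement described above can be checked numerically. The sketch below assumes steady-state levels in dB, a hard-knee compressor, and that the dry and compressed paths are summed in phase; the function name and the settings are hypothetical.

```python
import math


def parallel_level_db(input_db, threshold_db, ratio, wet_gain_db=0.0):
    """Level (dB) of the dry signal summed in phase with a compressed copy."""
    over = input_db - threshold_db
    compressed_db = threshold_db + over / ratio if over > 0 else input_db
    dry = 10.0 ** (input_db / 20.0)
    wet = 10.0 ** ((compressed_db + wet_gain_db) / 20.0)
    return 20.0 * math.log10(dry + wet)


for level_db in (-50.0, -30.0, -10.0, 0.0):
    boost = parallel_level_db(level_db, threshold_db=-40.0, ratio=10.0) - level_db
    print(f"input {level_db:6.1f} dB -> boosted by {boost:.2f} dB")
```

With these settings, quiet material is raised by about 6 dB while material near full level is raised by only a fraction of a decibel, so peaks are essentially untouched.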
Multiband (also spelled multi-band) compressors can act differently on different frequency bands. The advantage of multiband compression over full-bandwidth (full-band, or single-band) compression is that changing signal levels in one frequency band (such as from a sporadic low frequency kick drum) don't cause unneeded audible gain changes or "pumping" in other frequency bands.
Multiband compressors work by first splitting the signal through some number of bandpass filters or crossover filters. The frequency ranges or crossover frequencies may be adjustable. Each split signal then passes through its own compressor and is independently adjustable for threshold, ratio, attack, and release. The signals are then recombined and an additional limiting circuit may be employed to ensure that the combined effects do not create unwanted peak levels.
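The split, compress, and recombine structure can be sketched with two bands. This is a deliberately minimal illustration: it assumes a one-pole low-pass filter and its complement as the crossover, and a single static gain per band computed from the band's RMS level over the whole block, rather than time-varying gain with attack and release. All names and parameters are hypothetical.

```python
import math


def split_two_bands(samples, alpha):
    """One-pole low-pass and its complement, so that low + high equals the input."""
    low, high, state = [], [], 0.0
    for s in samples:
        state += alpha * (s - state)
        low.append(state)
        high.append(s - state)
    return low, high


def band_gain(samples, threshold_db, ratio):
    """Linear gain for one band, based on its RMS level over the block."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    over = 20.0 * math.log10(max(rms, 1e-9)) - threshold_db
    reduction_db = over - over / ratio if over > 0 else 0.0
    return 10.0 ** (-reduction_db / 20.0)


def multiband_compress(samples, alpha, settings):
    """settings holds one (threshold_db, ratio) pair per band: low, then high."""
    low, high = split_two_bands(samples, alpha)
    gain_low = band_gain(low, *settings[0])
    gain_high = band_gain(high, *settings[1])
    # Recombine the independently processed bands.
    return [gain_low * l + gain_high * h for l, h in zip(low, high)]


block = [math.sin(0.05 * n) + 0.2 * math.sin(2.5 * n) for n in range(400)]
processed = multiband_compress(block, alpha=0.1, settings=[(-20.0, 4.0), (0.0, 1.0)])
print(max(abs(v) for v in block), max(abs(v) for v in processed))
```

Here only the low band is compressed; the high band is passed at unity gain (ratio 1:1), which is the independence between bands that the text describes.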
Software plug-ins or DSP emulations of multiband compressors can be complex, with many bands, and require corresponding computing power.
Multiband compressors are primarily an audio mastering tool, but their inclusion in digital audio workstation plug-in sets is increasing their use among mix engineers. Hardware multiband compressors are also commonly used in the on-air signal chain of a radio station, either AM or FM, in order to increase the station's apparent loudness without fear of overmodulation. Having a louder sound is often considered an advantage in commercial competition. However, adjusting a multiband output compressor of a radio station also requires some artistic sense of style, plenty of time and a good pair of ears. This is because the constantly changing spectral balance between audio bands may have an equalizing effect on the output, by dynamically modifying the on-air frequency response. A further development of this approach is programmable radio output processing, where the parameters of the multiband compressor automatically change between different settings according to the current programme block style or the time of day.
Serial compression is a technique used in sound recording and mixing. Serial compression is achieved by using two fairly different compressors in a signal chain. One compressor will generally stabilize the dynamic range while the other will more aggressively compress stronger peaks. This is the normal internal signal routing in common combination devices marketed as "compressor-limiters", where an RMS compressor (for general gain control) would be directly followed by a fast peak sensing limiter (for overload protection). Done properly, even heavy serial compression can sound very natural in a way not possible with a single compressor. It is most often used to even out erratic vocals and guitars.
Compression is often used to make music sound louder without increasing its peak amplitude. By compressing the peak (or loudest) signals, it becomes possible to increase the overall gain (or volume) of a signal without exceeding the dynamic limits of a reproduction device or medium. The net effect, when compression is applied along with a gain boost, is that relatively quiet sounds become louder, while louder sounds remain unchanged.
Compression is often applied in this manner in audio systems for restaurants, retail, and similar public environments, where background music is played at a relatively low volume and needs to be compressed not just to keep the volume fairly constant, but also in order for relatively quiet parts of the music to be audible over ambient noise, or audible at all.
Compression can be used to increase the average output gain of a power amplifier by 50 to 100% with a reduced dynamic range. For paging and evacuation systems, this adds clarity under noisy circumstances and saves on the number of amplifiers required.
Compression is often used in music production to make performances more consistent in dynamic range so that they "sit" in the mix of other instruments better and maintain consistent attention from the listener. Vocal performances in rock music or pop music are usually compressed in order to make them stand out from the surrounding instruments and to add to the clarity of the vocal performance.
Compression can also be used on instrument sounds to create effects not primarily focused on boosting loudness. For instance, drum and cymbal sounds tend to decay quickly, but a compressor can make the sound appear to have a more sustained tail. Guitar sounds are often compressed in order to obtain a fuller, more sustained sound.
Most devices capable of compressing audio dynamics can also be used to reduce the volume of one audio source when another audio source reaches a certain level; see Side-Chaining above.
A compressor can be used to reduce sibilance ('ess' sounds) in vocals by feeding the compressor with an EQ set to the relevant frequencies, so that only those frequencies activate the compressor. If unchecked, sibilance could cause distortion even if sound levels are not very high. This usage is called 'de-essing'.[2]
Compression is used in voice communications in amateur radio that employ SSB modulation. Often it is used to make a particular station's signal more readable to a distant station, or to make one's own transmitted signal stand out against others. This occurs especially in pileups, where amateur radio stations are competing for the opportunity to talk to a DX station. Since an SSB signal's amplitude depends on the level of modulation, the net result is that the average amplitude of the signal, and hence the average transmitted power, is stronger than it would be had compression not been used.[8] Most modern amateur radio SSB transceivers have speech compressors built in.
Compression is also used in land mobile radio, especially in transmit audio of professional walkie-talkies and in remote control dispatch consoles.
Compression is used extensively in broadcasting to boost the perceived volume of sound while reducing the dynamic range of source audio (typically CDs) to a range that can be accommodated by the narrower-range broadcast signal. Broadcasters in most countries have legal limits on instantaneous peak volume they may broadcast. Normally these limits are met by permanently inserted hardware in the on-air chain (see multiband compression above).
As noted above, the use of compressors to boost perceived volume is a favorite trick of broadcasters who want their station to sound "louder" at the same volume than comparable stations on the dial. The effect is to make the more heavily compressed station "jump out" at the listener at a given volume setting. This technique began with competitive AM rock stations of the 1960s. AM broadcasters had no qualms about heavy compression since AM radio had such poor dynamic range anyway. The Gates Sta-level was an often-used compressor that would reduce "highs" and boost "lows" to yield a very "punchy" sound with the increased perceived volume mentioned above.
Heavy compression also complemented the style of 1960s DJs who talked or shouted over the music. With the proper setting, a DJ could be "mixed" into the music rather than being heard over it. This demanded that DJs deliver their patter in a very loud voice, which added to the energy of the broadcast sound and led to the much-parodied style of DJs who spoke with seeming over-emphasis on their words (called "pukers" in the business). It also allowed DJs to talk "in" rather than over the music without being as intrusive.
As rock became prevalent on FM in the mid-1960s, the CBS Volumax/Audimax was one legendary compression rig in use. It was favored because it "expanded" (lifted soft volume) only when there was signal to lift. Consequently, it would not expand an unmodulated signal, avoiding the boosting of the noise floor (hiss) that many previous units produced. However, it could create an annoying "sucking and pumping" effect (compression and expansion) if driven too severely.
In its effort to deliver a constant modulation (volume level) to the listener, compression works against the wider dynamic range of FM (as compared to AM) which was traditionally one of FM's obvious strong points. Consequently, the so-called "album rock" stations of the 70s and classical music and "easy listening" stations of that era in particular, avoided heavy compression. Classical stations hardly use any, which explains why a classical listener, particularly in the car, must keep turning the volume up and down, constantly fighting the ambient noise prevalent in car listening.
The same recording can have very different dynamics when heard via AM, FM, CD, or other media (although frequency response and noise are large factors as well).
With the advent of the CD and digital music, record companies, mixing engineers and mastering engineers have been gradually increasing the overall volume of commercial albums. Originally they would just push the volume up so that the single loudest point was at full volume; more recently they have used higher degrees of compression and limiting during mixing and mastering, with compression algorithms engineered specifically to maximize audio level in the digital stream. Hard limiting or hard clipping can result, affecting the tone and timbre of the music in a way that one critic describes as "dogshit".[9] The effort to increase loudness has been referred to as the "loudness wars".
Most television commercials are compressed heavily (typically to a dynamic range of no more than 3 dB) in order to achieve near-maximum perceived loudness while staying within permissible limits. This explains a chronic problem that TV viewers have noticed for years. While commercials receive heavy compression for the same reason that radio broadcasters have traditionally used it (to achieve a "loud" audio image), TV program material, particularly old movies with soft dialog, is comparatively uncompressed by TV stations. The result is commercials that seem startlingly loud, since the viewer has turned the volume up to hear the soft program audio. The problem is difficult to solve because much TV program audio, particularly the aforementioned old movies, has very little audio energy in it. Consequently, there is not much that can be electronically "expanded" with a compressor in an attempt to even out the volume. Even across the cable TV dial, with its myriad audio program sources, there is a wide disparity in audio volume levels.
A compressor is sometimes used to reduce the dynamic range of a signal for transmission, to be expanded afterwards. This reduces the effects of a channel with limited dynamic range. See Companding.
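The compress-then-expand round trip can be sketched in the dB domain. The example assumes a 2:1 compressor and a matching 1:2 expander, both referenced to 0 dB with no threshold; that ratio and reference point are illustrative assumptions, and the function names are hypothetical.

```python
def compand_compress_db(level_db: float, ratio: float = 2.0) -> float:
    """Compress levels toward the 0 dB reference before transmission."""
    return level_db / ratio


def compand_expand_db(level_db: float, ratio: float = 2.0) -> float:
    """Expand levels away from the 0 dB reference after transmission."""
    return level_db * ratio


for source_db in (0.0, -20.0, -80.0):
    sent_db = compand_compress_db(source_db)
    restored_db = compand_expand_db(sent_db)
    print(f"source {source_db:6.1f} dB -> channel {sent_db:6.1f} dB -> restored {restored_db:6.1f} dB")
```

Under these assumptions, a source spanning 80 dB occupies only 40 dB in the channel and is restored to its original range at the receiving end.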
Gain pumping, where a regular amplitude peak (such as a kick drum) causes the rest of the mix to change in volume due to the compressor, is generally avoided in music production. However, many dance and hip-hop musicians purposefully use this phenomenon, causing the mix to alter in volume rhythmically in time with the beat.
A compressor is used in hearing aids to bring the audio volume into the hearing range of the patient. To allow the patient to still hear the direction from which the sound is coming, binaural compression may be required.
Some software audio players support plugins which implement compression. These can be used to increase the perceived volume of audio tracks, or to even out the volume of highly-variable music (such as classical music, or a playlist spanning many music types). This improves the listenability of audio when played through poor-quality speakers, or when played in noisy environments (such as in a car or during a party). Such software may also be used in micro-broadcasting or home-based audio mastering.
Available software includes:
To achieve volume-compressed playback on devices other than computer-based audio players, files may need to be processed via the above software and then output as WAV, MP3, or other audio formats.