Gapless playback

From Wikipedia, the free encyclopedia

Gapless playback is the uninterrupted playback of consecutive digital audio tracks. It allows live recordings or consecutive tracks to be heard exactly as they were mastered, without gaps between tracks. Gapless playback comes naturally to compact discs and gramophone records, but is not always available with compressed digital audio. This can be a problem for listeners of music in which continuity is important, such as opera, classical music, progressive rock, and electronic music.

Causes for gaps

There are two main reasons why gaps occur during playback: compression scheme artifacts and design choice.

Compression scheme artifacts

Most lossy audio compression schemes add a small amount of silence to the beginning of a track. One reason is that many such schemes use a time/frequency domain transform (such as an MDCT), which introduces a gap known as encoder delay. The gap can grow at decode time, because the inverse MDCT introduces a delay of its own (decoder delay). Another factor is that transforms operate on fixed-size blocks of data. For the audio signal to be encoded in its entirety, a small amount of silence is appended to the input before the transform so that it fills a whole number of blocks. If this padding is not accounted for, it is decoded together with the audio data and introduces further gaps between tracks. As a result, the playing time of the decoded audio is often slightly longer than that of the source.[1]
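The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model rather than a description of any particular encoder: the block size of 1152 samples is the MPEG-1 Layer III frame size, while the encoder delay of 576 samples is only an illustrative assumption (real values differ between encoders), and decoder delay is ignored.

```python
import math

FRAME_SIZE = 1152     # samples per MPEG-1 Layer III frame
ENCODER_DELAY = 576   # illustrative assumption; the real value varies by encoder


def decoded_length(source_samples, delay=ENCODER_DELAY, frame=FRAME_SIZE):
    """Return (total decoded samples, trailing padding samples).

    The encoder prepends `delay` samples of silence, then pads the end
    so that the result fills a whole number of fixed-size blocks.
    """
    frames = math.ceil((source_samples + delay) / frame)
    total = frames * frame
    padding = total - delay - source_samples
    return total, padding


# A three-minute track sampled at 44.1 kHz.
source = 180 * 44100
total, padding = decoded_length(source)
extra_ms = (total - source) / 44100 * 1000
print(total, padding, round(extra_ms, 1))
```

Under these assumptions the decoded track is 1584 samples (about 36 milliseconds) longer than the source, split between silence at the start and padding at the end.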

The issue is partly technical and partly a matter of standards. The popular MP3 standard, for example, defines no way to record the amount of delay or padding for later removal. The encoder delay may also vary from encoder to encoder, making automatic removal difficult.[2] Even if two tracks are decompressed and merged into a single track, a gap will usually remain between them. More recent compressed audio formats (such as Ogg Vorbis) have been designed to address this problem, and can therefore produce gapless audio if played back correctly.

Design choice

Even when the audio file itself does not contain undesirable gaps, the design of the playback software, firmware, or hardware often adds gaps during playback. In some cases, software closes and re-opens the output stream when switching tracks, causing the hardware to produce a very short "click". More sophisticated gapless playback designs avoid this problem.

A different design problem arises when the software, firmware, or hardware is not ready to move seamlessly to the next track by the time the current track is complete. In this scenario, the listener is left waiting in silence while the player locates the next file, reads it, decodes the first blocks if necessary, and then starts loading the buffer for playback. The gap can be half a second or more, which is very noticeable in "continuous" music such as certain classical or dance genres. As of October 2006, many current portable hard disk and flash players suffer from this problem to a greater or lesser degree, including devices from iRiver and the Creative Zen range. The Archos Gmini XS202S supports gapless playback. The Rio Karma also supports gapless playback, but is no longer manufactured. The Trekstor Vibez, released in November 2006, is based on the same firmware as the Karma and does support gapless playback. The original (with a firmware update) and updated 5th generation iPod, along with the second generation iPod nano, now support gapless playback as well. The Sony PSP also supports gapless playback.

Many older audio players on personal computers do not implement the required buffering to play gapless audio. Some of these rely on third-party gapless audio plug-ins to buffer output. Some newer players and newer versions of old players now support gapless playback directly.

Optimal solution

Where gaps are caused by the compression process, it is technically possible to store metadata with the audio to explicitly declare the amount of delay/padding introduced in the encoding process. This information can be used to ensure that playtime will remain constant after decoding with no added silence. The audio playback software must be able to recognize the metadata, and trim the decoded audio as necessary. For uncompressed audio formats this sort of information is generally redundant; the start and end of the audio is well-defined.
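A minimal sketch of the trimming step, assuming the player has already read the delay and padding counts (in samples) from the file's metadata. The function name and the list-of-samples representation are hypothetical and chosen only for illustration.

```python
def trim_decoded(samples, delay, padding):
    """Drop `delay` leading and `padding` trailing samples from a decoded track.

    `delay` and `padding` are assumed to come from metadata written by the
    encoder; this sketch does not parse any real file format.
    """
    if delay < 0 or padding < 0:
        raise ValueError("delay and padding must be non-negative")
    end = len(samples) - padding
    if end < delay:
        raise ValueError("metadata claims more silence than the track contains")
    return samples[delay:end]


# Two decoded tracks, each with 2 samples of delay and 3 samples of padding.
decoded_a = [0, 0] + [1, 2, 3, 4, 5] + [0, 0, 0]
decoded_b = [0, 0] + [6, 7, 8] + [0, 0, 0]

# After trimming, the tracks join with no silence in between.
joined = trim_decoded(decoded_a, 2, 3) + trim_decoded(decoded_b, 2, 3)
print(joined)
```

Once both tracks are trimmed, their combined length equals that of the original audio, so the playing time stays constant.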

To address the effects of poor design, player software needs to achieve two main effects: ensuring the audio hardware itself is not stopped and started between tracks such that a click is added; and looking ahead slightly to process the next track while the current one is running, such that the data for the next track is immediately ready as the current one draws to a close.

If these areas are addressed, such that the software properly decodes the audio data and metadata, the next track is buffered and ready to play and the output stream remains open between tracks, optimal gapless audio is achieved. A collection of consecutive tracks will then play in the same way they were mastered, allowing the listener to hear their album as the author intended.
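The two requirements, one output stream held open across tracks and the next track decoded ahead of time, can be sketched as follows. `FakeOutput` is a hypothetical stand-in for a real audio device, and `decode` is any caller-supplied function that turns a track name into samples; neither corresponds to a real audio API.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeOutput:
    """Hypothetical stand-in for an audio device; records opens and writes."""

    def __init__(self):
        self.opens = 0
        self.written = []

    def open(self):
        self.opens += 1

    def write(self, block):
        self.written.extend(block)

    def close(self):
        pass


def play_gapless(tracks, decode, output):
    """Play tracks back to back through one open stream, decoding one ahead."""
    if not tracks:
        return
    output.open()  # opened once, never closed between tracks
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            pending = pool.submit(decode, tracks[0])
            for i in range(len(tracks)):
                samples = pending.result()  # ready before it is needed
                if i + 1 < len(tracks):
                    # Look ahead: decode the next track while this one plays.
                    pending = pool.submit(decode, tracks[i + 1])
                output.write(samples)
    finally:
        output.close()


library = {"track1": [1, 2, 3], "track2": [4, 5], "track3": [6]}
out = FakeOutput()
play_gapless(["track1", "track2", "track3"], library.__getitem__, out)
print(out.opens, out.written)
```

A real player would write in small blocks and bound the amount of decoded audio it holds in memory; the sketch only shows the ordering of the steps.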

Alternative solutions

Digital signal processor (DSP) plugins can be used to detect silence between tracks and trim the audio as necessary on playback. This is not an optimal solution because it does not always produce results identical to the source. Sometimes an artist may intentionally leave silence at track boundaries for dramatic effect; removing this silence also removes that effect.

It can also be difficult to properly implement silence removal. If the silence threshold is too low and the track contains decoder artifacts, the software may not recognise some silences. Conversely, if the threshold is too high, the software may remove entire sections of quiet music at the beginning or end of a track.
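The sensitivity to the threshold can be illustrated with a minimal amplitude-based trimmer. This is a sketch only: the function name and the sample values are made up, and real plugins typically work on windows of samples rather than single values.

```python
def trim_silence(samples, threshold):
    """Strip leading and trailing samples whose magnitude is <= threshold."""
    start = 0
    end = len(samples)
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return samples[start:end]


# Near-silent padding with small decoder artifacts (0.001), followed by a
# quiet fade-in (0.02), loud music, a quiet fade-out (0.03), and more padding.
track = [0.0, 0.001, 0.0, 0.02, 0.5, -0.4, 0.03, 0.0, 0.001]

too_low = trim_silence(track, 0.0)      # artifacts defeat the detector
reasonable = trim_silence(track, 0.005)
too_high = trim_silence(track, 0.05)    # the quiet fades are cut as well
print(len(too_low), reasonable, too_high)
```

With a threshold of zero almost nothing is removed, because the artifacts are not exactly silent; with a threshold of 0.05 the quiet fade-in and fade-out are lost along with the padding.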

DSP plugins can also be used to cross-fade between tracks. This eliminates gaps that some listeners find distracting, but it also greatly alters the audio data and is not always desirable. In particular, when consecutive tracks are meant to be played together and the transition occurs at high volume, cross-fading results in a large volume drop.
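A linear cross-fade, one of several possible fade curves, can be sketched as below. The function name and the choice of a linear curve are assumptions made for illustration. The sketch shows how the overlapping region is replaced by a blend of both tracks, which is why the result no longer matches the source.

```python
def crossfade(a, b, overlap):
    """Blend the last `overlap` samples of `a` into the first of `b` linearly."""
    if overlap < 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must fit inside both tracks")
    split = len(a) - overlap
    mixed = []
    for i in range(overlap):
        gain_in = (i + 1) / (overlap + 1)   # rises towards 1 for track b
        gain_out = 1.0 - gain_in            # falls towards 0 for track a
        mixed.append(a[split + i] * gain_out + b[i] * gain_in)
    return list(a[:split]) + mixed + list(b[overlap:])


end_of_first = [1.0, 1.0, 1.0, 1.0]
start_of_second = [0.0, 0.0, 0.0, 0.0]
faded = crossfade(end_of_first, start_of_second, 2)
print(len(faded), faded)
```

The output is shorter than the two inputs combined, because the overlapping samples are merged rather than played in sequence.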

Both of these alternative solutions are typically used with compression methods that do not support the metadata needed for gapless playback. Like the optimal solution, they still require buffering and an output stream that is not closed between tracks; however, they require more computation, making them less efficient. In portable digital audio players, this can mean reduced playing time on batteries.

Due to the drawbacks of the alternative solutions above, some listeners dislike their negative effects more than the gap they attempt to remove. Another problem is that the solutions above do nothing to prevent the output stream from being closed and reopened at track boundaries; some measures can be taken to simulate a gapless output stream, but they are not always successful and side-effects may occur.

Another alternative is to ignore track boundaries and encode a whole collection of tracks as a single compressed file, relying on cue sheets (or something similar) for navigation. While this method gives gapless playback when the tracks in the collection are played consecutively, it can be unwieldy because the resulting compressed file may be large. Furthermore, unless the playback software or hardware can recognize cue sheets, navigating between tracks may be difficult.
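Navigation within such a file amounts to converting each track's cue sheet time stamp into a sample offset. The sketch below assumes the common cue sheet layout in which each track has an `INDEX 01 mm:ss:ff` entry, with `ff` counted in frames of 1/75 second; the file name and track times in the example are invented.

```python
import re

FRAMES_PER_SECOND = 75  # cue sheet time stamps are mm:ss:ff at 75 frames/s

EXAMPLE_CUE = """\
FILE "album.flac" WAVE
  TRACK 01 AUDIO
    INDEX 01 00:00:00
  TRACK 02 AUDIO
    INDEX 01 04:31:37
"""


def track_offsets(cue_text, sample_rate=44100):
    """Return the start offset, in samples, of each track's INDEX 01 entry."""
    offsets = []
    pattern = r"INDEX\s+01\s+(\d+):(\d{2}):(\d{2})"
    for match in re.finditer(pattern, cue_text):
        minutes, seconds, frames = (int(g) for g in match.groups())
        total_frames = (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames
        offsets.append(total_frames * sample_rate // FRAMES_PER_SECOND)
    return offsets


print(track_offsets(EXAMPLE_CUE))
```

A player that understands the cue sheet can seek to these offsets within the single decoded stream, so skipping between tracks does not require opening a new file.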

Format support

  • Because lossless data compression does not introduce padding, all lossless audio file formats are inherently gapless.
  • The following lossy audio file formats have provisions for gapless encoding.
  • Some other formats do not officially support gapless encoding, but some implementations of encoders or decoders may handle gapless metadata.
    • LAME-encoded MP3 can be gapless with players that support the LAME MP3 info tag.
    • AAC in MP4 encoded with Nero Digital from Nero AG can be gapless with foobar2000 or the latest XMMS2.
    • AAC in MP4 encoded with iTunes (current and previous versions) is gapless in iTunes 7.0 onwards, 2nd generation iPod nanos, all video-capable iPods with the latest firmware, and recent versions of foobar2000.
    • iTunes-encoded MP3 is gapless when played back in iTunes 7.0 onwards, 2nd generation iPod nanos, and all video-capable iPods with the latest firmware.

References

  1. ^ Taylor, Mark (2003). LAME Technical FAQ. Retrieved on 2006-07-06.
  2. ^ Robinson, David (2001). lame v3.81 and 3.87 beta mp3 decoding quality test results. Retrieved on 2006-08-24. Features a table of encoder delay values.
  3. ^ Thread on Gapless Playback on Amarok Mailing List (2006-09-06). Retrieved on 2007-01-19.
