Codec listening test
A codec listening test is a scientific study designed to compare two or more lossy audio codecs, usually with respect to perceived fidelity and/or compression efficiency.
Most tests take the form of a double-blind comparison. Commonly used methods include ABX, ABC/HR, and MUSHRA. Various software packages allow individuals to perform this type of testing themselves with minimal assistance.
In an ABX test, the listener has to identify an unknown sample X as being either A or B, with A (usually the original) and B (usually the encoded version) available for reference. For the outcome to count as evidence of an audible difference, it must be statistically significant. This setup ensures that the listener is not biased by their expectations and that the outcome is unlikely to be the result of chance. If the listener cannot identify X reliably, that is, with a low p-value over a predetermined number of trials, then the null hypothesis cannot be rejected and the test provides no evidence of a perceptible difference between samples A and B. This is usually taken to indicate that the encoded version is transparent to that listener, although failing to reject the null hypothesis does not prove that the samples are indistinguishable.
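The significance check described above can be sketched with a one-sided binomial test: under the null hypothesis the listener is guessing, so each trial is correct with probability 0.5. The function name `abx_p_value`, the 16-trial session, and the 0.05 threshold are illustrative assumptions, not values taken from any particular test.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right out of
    `trials` ABX trials by pure guessing (one-sided binomial test, p = 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the outcomes with `correct` or more successes, then divide by
    # the 2**trials equally likely outcomes under guessing.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


ALPHA = 0.05  # assumed significance threshold, fixed before the test starts
TRIALS = 16   # assumed predetermined number of trials

for correct in (9, 12, 14):
    p = abx_p_value(correct, TRIALS)
    verdict = "difference detected" if p < ALPHA else "null hypothesis not rejected"
    print(f"{correct}/{TRIALS} correct: p = {p:.4f} ({verdict})")
```

Under these assumptions, 9 correct answers out of 16 are well within what guessing produces, while 12 or more fall below the 0.05 threshold.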
In an ABC/HR test, C is the original, which is always available for reference. A and B are the original and the encoded version in randomized order. The listener must first distinguish the encoded version from the original (the hidden reference that the "HR" in ABC/HR stands for) before assigning a score as a subjective judgment of quality. Different encoded versions can then be compared with one another using these scores.
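A single ABC/HR trial can be sketched as follows. This is only an illustration of the procedure described above: the names `Trial`, `make_trial`, and `evaluate` are hypothetical, and the rating scale with a top score of 5.0 meaning "no audible difference" is an assumption rather than something the text specifies.

```python
import random
from dataclasses import dataclass

MAX_SCORE = 5.0  # assumed top of the rating scale ("no audible difference")


@dataclass(frozen=True)
class Trial:
    reference_slot: str  # "A" or "B": the slot that secretly plays the original

    @property
    def encoded_slot(self) -> str:
        return "B" if self.reference_slot == "A" else "A"


def make_trial(rng: random.Random) -> Trial:
    """Randomly decide which of A and B holds the hidden reference."""
    return Trial(reference_slot=rng.choice(["A", "B"]))


def evaluate(trial: Trial, ratings: dict[str, float]) -> tuple[bool, float]:
    """Return (reference_rated_clean, encoded_score).

    The first value is False when the listener marked the hidden reference
    as degraded, i.e. failed to tell the encoded version from the original.
    """
    reference_rated_clean = ratings[trial.reference_slot] == MAX_SCORE
    return reference_rated_clean, ratings[trial.encoded_slot]


trial = Trial(reference_slot="B")
print(evaluate(trial, {"A": 3.5, "B": 5.0}))  # (True, 3.5)
print(evaluate(trial, {"A": 5.0, "B": 4.0}))  # (False, 5.0)
```

In the second call the listener rated the hidden reference as the degraded sample, so that trial gives no valid quality score for the codec.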
Results
Many double-blind music listening tests have been carried out. The following table lists the results of several listening tests that have been published online. To obtain meaningful results, listening tests must compare codecs at similar or identical bitrates, since the audio quality produced by a lossy encoder generally improves as the bitrate increases. If listeners cannot consistently distinguish a lossy encoder's output from the uncompressed original audio, then it may be concluded that the codec has achieved transparency.
Popular codecs compared in these tests include MP3, AAC (and extensions), Vorbis, Musepack, and WMA. The RealAudio Gecko, ATRAC3, QDesign, and MP3pro codecs appear in some tests, despite much lower adoption as of 2007. Many encoder and decoder implementations (both proprietary and open source) exist for some codecs, such as MP3, which is the oldest and best-known codec still in widespread use today.
The Musepack and Vorbis codecs are products of open-source projects and were designed to avoid patented algorithms; according to many of these listening tests, both have competed well with patented, proprietary codecs such as MP3, AAC, and WMA.
Source | Dates | Codecs | Bitrate (kbit/s) | Implementations | Musical genres | Samples | Listeners | Winner | Comments |
---|---|---|---|---|---|---|---|---|---|
ff123 | 2001 | multiple | ~128 | | | 1 | 16 | Musepack and AAC | |
ff123 | October 2001-January 2002 | multiple | ~128 | | Various | 3 | 25-28 | Musepack or Vorbis | |
ff123 | July 2002 | multiple | ~64 | | Various | 12 | 24-41 | MP3pro | Both Vorbis variants were a close second. |
Roberto Amorim | June 2003 | AAC | 128 CBR | | Various | 10 | 11-18 | QuickTime | |
Roberto Amorim | July 2003 | multiple | ~128 | | Various | 12 | 14-24 | Musepack | AAC, WMA, and Vorbis tied for close second |
Roberto Amorim | September 2003 | multiple | ~64 | | Various | 12 | 30-43 | Nero HE AAC | This test showed that listeners preferred 128 kbit/s MP3 audio encoded by LAME to all the tested codecs at 64 kbit/s, with greater than 99% confidence: "No codec delivers the marketing plot [sic] of same quality as MP3 at half the bitrates." |
Roberto Amorim | January 2004 | MP3 | ~128 | | Various | 12 | 11-22 | LAME | Serious issues with Xing and iTunes encodings were discovered after the test, and documented on results page. |
Roberto Amorim | February 2004 | AAC | ~128 | | Various | 12 | 19-29 | iTunes | Open-source FAAC codec improved greatly since previous test |
Roberto Amorim | May 2004 | multiple | ~128 | | Various | 18 | 12-27 | aoTuV and Musepack | |
Roberto Amorim | June 2004 | multiple | 32 CBR | | Various | 18 | 47-77 | Nero Digital | |
HydrogenAudio user "guruboolez" | July 2004 | multiple | ~175 | | Classical | 18 | 1 | Musepack | |
HydrogenAudio user "guruboolez" | August 2005 | multiple | ~180 | | Classical | 18 | 1 | aoTuV | The author reflects on substantial improvements in Vorbis encoding since his previous test (above): "Vorbis is now –thanks to Aoyumi [creator of aoTuV]– an excellent audio format for 180 kbit/s encodings (and classical music)." |
gURuBoOleZZ (French) | August 2005 | MP3 | ~96 | | Classic, various | 150 classical, 35 various | 1 | aoTuV and AAC tied (classical), aoTuV (various) | The author selected each participating encoder by pitting multiple encoders against one another in an initial "Darwinian phase." For example, LAME was chosen as the representative MP3 encoder because it clearly outperformed four other MP3 encoders on a subset of the full sample corpus. |
Sebastian Mares | December 2005 | multiple | ~140 (nominal 128) | | Various | 18 | 18-30 | 4-way tie (all except Shine) | "I think this test shows that with the current encoders, the quality at 128 kbit/s is very good... It's time to move to bitrates like 96 kbit/s or even lower (64 kbit/s)." |
Mp3-tech.org | March 2006 | AAC | 48 | | Various | 18 | 10-20 | | |
Sebastian Mares | November 2006 | multiple | ~48 | | Various | 20 | 22-34 | Nero Digital | WMA Professional and aoTuV tied for second |
Sebastian Mares | July 2007 | multiple | ~64 | | Various | 18 | 21-33 | Nero Digital and WMA Professional | |
External links
- Hydrogenaudio - Community audiophile site, host of most non-commercial ABX testing
- ff123's ABC/HR Audio Comparison Utility for Windows
- SoundExpert - Continuous blind listening tests of codecs over the Internet