WAV

Waveform Audio File Format (WAVE/WAV)
Filename extension: .wav, .wave
Internet media type: audio/vnd.wave,[1] audio/wav, audio/wave, audio/x-wav[2]
Type code: WAVE
Uniform Type Identifier (UTI): com.microsoft.waveform-audio
Developed by: Microsoft & IBM
Initial release: 1991[3]
Latest release: Multiple Channel Audio Data and WAVE Files (7 March 2007 update)[4][5]
Type of format: audio file format, container format
Extended from: RIFF
Extended to: BWF, RF64

Waveform Audio File Format (WAVE, more commonly known as WAV after its filename extension;[3][6][7][8] rarely, Audio for Windows)[9] is a Microsoft and IBM audio file format standard for storing an audio bitstream on PCs. It is an application of the Resource Interchange File Format (RIFF), which stores data in "chunks", and is therefore close to the 8SVX and AIFF formats used on Amiga and Macintosh computers, respectively. It is the main format used on Windows systems for raw and typically uncompressed audio. The usual bitstream encoding is linear pulse-code modulation (LPCM).

Description

Both WAV and AIFF files are compatible with Windows, Macintosh, and Linux operating systems. The WAV format reflects some characteristics of Intel CPUs, such as little-endian byte order. The RIFF format acts as a "wrapper" for various audio coding formats.

Though a WAV file can contain compressed audio, the most common WAV audio format is uncompressed audio in the linear pulse-code modulation (LPCM) format. LPCM is also the standard audio coding format for audio CDs, which store two-channel LPCM audio sampled 44,100 times per second with 16 bits per sample. Because LPCM is uncompressed and retains all of the samples of an audio track, professional users or audio experts may use the WAV format with LPCM audio for maximum audio quality.[10] WAV files can also be edited and manipulated with relative ease using software.

On Windows, the WAV format supports compressed audio through the Audio Compression Manager (ACM). Any ACM codec can be used to compress a WAV file. The ACM user interface can be accessed through various programs that use it, including Sound Recorder in some versions of Windows.

Beginning with Windows 2000, a WAVE_FORMAT_EXTENSIBLE header was defined that specifies multiple audio channel data along with speaker positions, eliminates ambiguity regarding sample types and container sizes in the standard WAV format, and supports defining custom extensions to the format chunk.[4][5][11]

There are some inconsistencies in the WAV format: for example, 8-bit data is unsigned while 16-bit data is signed, and many chunks duplicate information found in other chunks.
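The signedness difference matters whenever raw sample bytes are converted to numbers. The following is a minimal sketch, assuming little-endian LPCM data and handling only the two bit depths mentioned above; the function name `decode_pcm` and the scaling to floating point are illustrative choices, not part of the format.

```python
import struct

def decode_pcm(raw: bytes, bits_per_sample: int) -> list[float]:
    """Convert raw little-endian PCM bytes to floats in roughly [-1.0, 1.0)."""
    if bits_per_sample == 8:
        # 8-bit WAV samples are unsigned, so 128 is the midpoint (silence).
        return [(b - 128) / 128.0 for b in raw]
    if bits_per_sample == 16:
        # 16-bit WAV samples are signed two's complement, little-endian.
        count = len(raw) // 2
        values = struct.unpack(f"<{count}h", raw[: count * 2])
        return [v / 32768.0 for v in values]
    raise ValueError("this sketch handles only 8- and 16-bit PCM")

# The same "silence, maximum, minimum" sequence in both encodings.
eight_bit = decode_pcm(bytes([128, 255, 0]), 8)
sixteen_bit = decode_pcm(struct.pack("<3h", 0, 32767, -32768), 16)
```

A reader that treats 8-bit data as signed would interpret silence (128) as the most negative value, which is why the bit depth has to be checked before decoding.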

Specification

The WAV file is an instance of a Resource Interchange File Format (RIFF) defined by IBM and Microsoft.[12]

RIFF

A RIFF file is a tagged file format. It has a specific container format (a chunk) that includes a four character tag (FourCC) and the size (number of bytes) of the chunk. The tag specifies how the data within the chunk should be interpreted, and there are several standard FourCC tags. Tags consisting of all capital letters are reserved tags. The outermost chunk of a RIFF file has a RIFF form tag; the first four bytes of chunk data are a FourCC that specify the form type and are followed by a sequence of subchunks. In the case of a WAV file, those four bytes are the FourCC WAVE. The remainder of the RIFF data is a sequence of chunks describing the audio information.

The advantage of a tagged file format is that the format can be extended later without confusing existing file readers.[13] The rule for a RIFF (or WAV) reader is that it should ignore any tagged chunk that it does not recognize.[14] The reader won't be able to use the new information, but the reader should not be confused.
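The chunk layout and the "ignore what you do not recognize" rule can be sketched in a few lines. This is an illustrative reader, not a reference implementation: the helper names `make_chunk` and `read_wave_chunks` are hypothetical, and the code relies on two details from the RIFF specification that are not spelled out in the text above, namely that chunk sizes are little-endian 32-bit integers and that a chunk with an odd-sized payload is followed by one pad byte.

```python
import io
import struct

def make_chunk(fourcc: bytes, payload: bytes) -> bytes:
    """Serialize one chunk: FourCC, little-endian size, payload, pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return fourcc + struct.pack("<I", len(payload)) + payload + pad

def read_wave_chunks(blob: bytes, known=(b"fmt ", b"data")) -> dict:
    """Return the payloads of recognized chunks and skip everything else."""
    stream = io.BytesIO(blob)
    tag, riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if tag != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    # The RIFF size field does not count the first 8 bytes (tag and size).
    end = min(8 + riff_size, len(blob))
    found = {}
    while stream.tell() + 8 <= end:
        fourcc, size = struct.unpack("<4sI", stream.read(8))
        payload = stream.read(size)
        if size % 2:
            stream.read(1)  # skip the pad byte after an odd-sized payload
        if fourcc in known:
            found[fourcc] = payload
        # Unrecognized chunks (here, JUNK) are simply ignored.
    return found

# Build a tiny in-memory file with an unknown chunk ahead of the format chunk.
body = (b"WAVE"
        + make_chunk(b"JUNK", b"abc")
        + make_chunk(b"fmt ", bytes(16))
        + make_chunk(b"data", b"\x01\x02"))
wav = make_chunk(b"RIFF", body)
chunks = read_wave_chunks(wav)
```

Because the reader advances by the declared size rather than by knowledge of the contents, a chunk type added after the reader was written does not disturb parsing of the chunks that follow it.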

The specification for RIFF files includes the definition of an INFO chunk. The chunk may include information such as the title of the work, the author, the creation date, and copyright information. Although the INFO chunk was defined in version 1.0, the chunk was not referenced in the formal specification of a WAV file. If the chunk were present in the file, then a reader should know how to interpret it, but many readers had trouble. Some readers would abort when they encountered the chunk, some readers would process the chunk if it were the first chunk in the RIFF form,[15] and other readers would process it if it followed all of the expected waveform data. Consequently, the safest thing to do from an interchange standpoint was to omit the INFO chunk and other extensions and send a lowest-common-denominator file. There are other INFO chunk placement problems.

RIFF files were expected to be used in international environments, so there is a CSET chunk to specify the country code, language, dialect, and code page for the strings in a RIFF file.[16] For example, specifying an appropriate CSET chunk should allow the strings in an INFO chunk (and other chunks throughout the RIFF file) to be interpreted as Cyrillic or Japanese characters.

RIFF also defines a JUNK chunk whose contents are uninteresting.[17] The chunk allows a chunk to be deleted by just changing its FourCC. The chunk could also be used to reserve some space for future edits so the file could be modified without being rewritten. A later definition of RIFF introduced a similar PAD  chunk.[18]

RIFF WAVE

The top-level definition of a WAV file is:[19]

<WAVE-form> → RIFF('WAVE'
                   <fmt-ck>            // Format
                   [<fact-ck>]         // Fact chunk
                   [<cue-ck>]          // Cue points
                   [<playlist-ck>]     // Playlist
                   [<assoc-data-list>] // Associated data list
                   <wave-data> )       // Wave data

The definition shows a top-level RIFF form with the WAVE tag. It is followed by a mandatory <fmt-ck> format chunk that describes the format of the sample data that follows. The format chunk includes information such as the sample encoding, the number of bits per sample, the number of channels, and the sample rate. The WAV specification includes some optional features. The optional fact chunk reports the number of samples for some compressed coding schemes. The cue point (cue ) chunk identifies some significant sample numbers in the wave file. The playlist chunk allows the samples to be played out of order or repeated rather than just from beginning to end. The associated data list allows labels and notes (labl and note) to be attached to cue points; text annotation (ltxt) may be given for a group of samples (e.g., caption information). Finally, the mandatory wave data chunk contains the actual samples (in the specified format).
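As an illustration of what the format chunk carries, the sketch below unpacks the 16-byte layout commonly used for plain PCM: a 16-bit format tag, a 16-bit channel count, a 32-bit sample rate, a 32-bit average byte rate, a 16-bit block alignment, and a 16-bit bits-per-sample value, all little-endian. The Python field names are illustrative, and the layout is stated here as an assumption rather than taken from the text above; compressed formats and WAVE_FORMAT_EXTENSIBLE append further fields that this sketch ignores.

```python
import struct
from typing import NamedTuple

class FmtChunk(NamedTuple):
    format_tag: int         # 1 denotes PCM
    channels: int           # 16-bit field
    sample_rate: int        # 32-bit field, in Hz
    avg_bytes_per_sec: int
    block_align: int        # bytes per sample frame (all channels)
    bits_per_sample: int

def parse_fmt(data: bytes) -> FmtChunk:
    """Decode the first 16 bytes of a format chunk payload."""
    if len(data) < 16:
        raise ValueError("fmt chunk too short")
    return FmtChunk(*struct.unpack("<HHIIHH", data[:16]))

# CD-quality stereo: 44,100 Hz, 16 bits per sample, 2 channels.
payload = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
fmt = parse_fmt(payload)
```

In this layout the byte rate and block alignment are derivable from the other fields (here 44100 × 4 = 176400), which is one instance of the redundancy the format is known for.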

Note that the WAV file definition does not show where an INFO chunk should be placed. It is also silent about the placement of a CSET chunk (which specifies the character set used).

The RIFF specification attempts to be a formal specification, but its formalism lacks the precision seen in other tagged formats. For example, the RIFF specification does not clearly distinguish between a set of subchunks and an ordered sequence of subchunks. The RIFF form chunk suggests it should be a sequence container.[20] The specification suggests a LIST chunk is also a sequence: "A LIST chunk contains a list, or ordered sequence, of subchunks."[21] However, the specification does not give a formal specification of the INFO chunk; an example INFO LIST chunk ignores the chunk sequence implied in the INFO description.[22] The LIST chunk definition for <wave-data> does use the LIST chunk as a sequence container with good formal semantics.

The WAV specification allows for not only a single, contiguous, array of audio samples, but also discrete blocks of samples and silence that are played in order. Most WAV files use a single array of data. The specification for the sample data is confused:[23]

The <wave-data> contains the waveform data. It is defined as follows:
  <wave-data>  → { <data-ck> | <data-list> }
  <data-ck>    → data( <wave-data> )
  <wave-list>  → LIST( 'wavl' { <data-ck> | // Wave samples
                                <silence-ck> }... ) // Silence
  <silence-ck> → slnt( <dwSamples:DWORD> ) // Count of silent samples

These productions are confused. Apparently <data-list> (undefined) and <wave-list> (defined but not referenced) should be identical. Even if that problem is fixed, the productions then allow a <data-ck> to contain a recursive <wave-data> (which implies data interpretation problems). The specification should have been something like:

  <wave-data>  → { <data-ck> | <wave-list> }
  <data-ck>    → data( <bSampleData:BYTE> ... )
  <wave-list>  → LIST( 'wavl' { <data-ck> | // Wave samples
                                <silence-ck> }... ) // Silence
  <silence-ck> → slnt( <dwSamples:DWORD> ) // Count of silent samples

to avoid the recursion.
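Under the corrected grammar, a reader of the 'wavl' form only has to distinguish two subchunk types. The sketch below is a hedged illustration of that reading: the function name `parse_wavl` is hypothetical, the input is assumed to be the payload of a LIST chunk (starting with its list type), and silence is reported as a sample count rather than expanded into sample values.

```python
import struct

def parse_wavl(list_payload: bytes) -> list:
    """Split a LIST 'wavl' payload into ('data', bytes) and ('silence', n) segments."""
    if list_payload[:4] != b"wavl":
        raise ValueError("not a 'wavl' list")
    segments = []
    pos = 4
    while pos + 8 <= len(list_payload):
        fourcc, size = struct.unpack_from("<4sI", list_payload, pos)
        body = list_payload[pos + 8 : pos + 8 + size]
        if fourcc == b"data":
            segments.append(("data", body))
        elif fourcc == b"slnt":
            # The silence chunk holds one DWORD: the count of silent samples.
            (count,) = struct.unpack("<I", body[:4])
            segments.append(("silence", count))
        # Advance past header, payload, and the pad byte of odd-sized chunks.
        pos += 8 + size + (size % 2)
    return segments

example = (b"wavl"
           + b"data" + struct.pack("<I", 2) + b"\x01\x02"
           + b"slnt" + struct.pack("<I", 4) + struct.pack("<I", 100))
segments = parse_wavl(example)
```

Because a data chunk here holds only sample bytes, the parser never has to recurse, which is the point of the corrected productions.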

WAV files can contain embedded IFF "lists", which can contain several "sub-chunks".[24][25][26]

Metadata

As a derivative of RIFF, WAV files can be tagged with metadata in the INFO chunk. In addition, WAV files can embed any kind of metadata, including but not limited to Extensible Metadata Platform (XMP) data or ID3 tags[27] in extra chunks. Applications may not handle this extra information or may expect to see it in a particular place. Although the RIFF specification requires that applications ignore chunks they do not recognize, some applications are confused by additional chunks.

Popularity

Uncompressed WAV files are large, so sharing WAV files over the Internet is uncommon. However, WAV is a commonly used file type, suitable for retaining first-generation archived files of high quality, for use on systems where disk space is not a constraint, or in applications such as audio editing, where the time spent compressing and decompressing data is a concern.

The continued use of the WAV format owes much to its familiarity and simple structure. Because of this, it continues to enjoy widespread use with a variety of software applications, often functioning as a "lowest common denominator" when it comes to exchanging sound files among different programs.

Use by broadcasters

In spite of their large size, uncompressed WAV files are used by some radio broadcasters, especially those that have adopted a tapeless system.

Limitations

The WAV format is limited to files smaller than 4 GiB, because the header records the file size as a 32-bit unsigned integer (some programs limit the file size to 2 GiB). Although 4 GiB is equivalent to about 6.8 hours of CD-quality audio (44.1 kHz, 16-bit stereo), it is sometimes necessary to exceed this limit, especially when higher sampling rates, bit depths, or channel counts are required. The W64 format was therefore created for use in Sound Forge; its 64-bit header allows for much longer recording times. The RF64 format specified by the European Broadcasting Union was also created to solve this problem.
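The duration figure follows directly from the data rate of uncompressed PCM. A small sketch of the arithmetic, ignoring the few dozen bytes of header overhead (the function name and the second example configuration are illustrative):

```python
def max_duration_hours(sample_rate_hz: int, bits_per_sample: int,
                       channels: int, limit_bytes: int = 2**32) -> float:
    """Hours of uncompressed PCM that fit within a byte limit (headers ignored)."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return limit_bytes / bytes_per_second / 3600

cd_quality = max_duration_hours(44100, 16, 2)   # about 6.76 hours
surround_hd = max_duration_hours(96000, 24, 6)  # under one hour
```

CD-quality stereo consumes 176,400 bytes per second, so 4 GiB lasts a little under seven hours, while a six-channel 96 kHz, 24-bit recording reaches the limit in well under an hour.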

Non-audio data

Since the sampling rate of a WAV file can vary from 1 Hz to 4.3 GHz, and the number of channels can be as high as 65535, .wav files have also been used for non-audio data. LTspice, for instance, can store multiple circuit trace waveforms in separate channels, at any appropriate sampling rate, with the full-scale range representing ±1 V or A rather than a sound pressure.[28]

Audio CDs

Audio CDs do not use the WAV file format; they use Red Book audio instead. What the two have in common is that both encode audio as uncompressed PCM. WAV is a computer file format that most CD players cannot read directly. To record WAV files to an audio CD, the file headers must be stripped and the remaining PCM data written directly to the disc as individual tracks, with zero-padding added to match the CD's sector size. To be burned to an audio CD, a WAV file should be in 44,100 Hz, 16-bit stereo format.
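The padding step can be sketched as follows. The sector size of 2,352 bytes (1/75 of a second of 44.1 kHz, 16-bit stereo audio) is not stated above and is supplied here as an assumption about Red Book audio; the function name is illustrative, and real burning software performs further steps that this sketch omits.

```python
CD_SECTOR_BYTES = 2352  # assumed audio sector size: 44100 * 4 bytes / 75

def pad_to_sector(pcm: bytes) -> bytes:
    """Append zero bytes so the PCM data ends on a sector boundary."""
    remainder = len(pcm) % CD_SECTOR_BYTES
    if remainder:
        pcm += b"\x00" * (CD_SECTOR_BYTES - remainder)
    return pcm

# One byte past a full sector is padded out to two full sectors.
track = pad_to_sector(b"\x01" * (CD_SECTOR_BYTES + 1))
```

Zero bytes decode as silence in signed 16-bit PCM, so the padding is inaudible at the end of a track.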

WAV file audio coding formats compared

Audio in WAV files can be encoded in a variety of audio coding formats, such as GSM or MP3, to reduce the file size.

The following table compares the monophonic (not stereophonic) audio quality and bitrates of audio coding formats available for WAV files, including PCM, ADPCM, Microsoft GSM 06.10, CELP, SBC, Truespeech, and MPEG Layer-3.

Format                                    Bitrate (kbit/s)  1 minute (KiB)  Sample
11,025 Hz 16 bit PCM                      176.4             1292            11k16bitpcm.wav
8,000 Hz 16 bit PCM                       128               938             8k16bitpcm.wav
11,025 Hz 8 bit PCM                       88.2              646             11k8bitpcm.wav
11,025 Hz µ-Law                           88.2              646             11kulaw.wav
8,000 Hz 8 bit PCM                        64                469             8k8bitpcm.wav
8,000 Hz µ-Law                            64                469             8kulaw.wav
11,025 Hz 4 bit ADPCM                     44.1              323             11kadpcm.wav
8,000 Hz 4 bit ADPCM                      32                234             8kadpcm.wav
11,025 Hz GSM 06.10                       18                132             11kgsm.wav
8,000 Hz MP3 16 kbit/s                    16                117             8kmp316.wav
8,000 Hz GSM 06.10                        13                103             8kgsm.wav
8,000 Hz Lernout & Hauspie SBC 12 kbit/s  12                88              8ksbc12.wav
8,000 Hz DSP Group Truespeech             9                 66              8ktruespeech.wav
8,000 Hz MP3 8 kbit/s                     8                 60              8kmp38.wav
8,000 Hz Lernout & Hauspie CELP           4.8               35              8kcelp.wav

The above are WAV files; even those that use MP3 compression have the ".wav" extension.
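For the uncompressed PCM rows, both table columns can be derived from the sample rate and bit depth. The sketch below shows the arithmetic for mono audio; the function names are illustrative, and the result ignores header overhead, so it is only an approximation of real file sizes and does not model the compressed formats.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int,
                     channels: int = 1) -> float:
    """Bitrate of uncompressed PCM in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000

def one_minute_kib(bitrate_kbps: float) -> float:
    """Size of one minute of audio in KiB (1 KiB = 1024 bytes), headers ignored."""
    return bitrate_kbps * 1000 * 60 / 8 / 1024

rate = pcm_bitrate_kbps(11025, 16)  # first row of the table: 176.4 kbit/s
size = one_minute_kib(rate)         # just under 1292 KiB
```

Note the mixed units: bitrates use decimal kilobits, while file sizes use binary kibibytes, which is why 176.4 kbit/s does not simply scale to 1323 in the size column.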

References

  1. Microsoft Corporation (June 1998). "WAVE and AVI Codec Registries - RFC 2361". IETF. Retrieved 2009-12-06.
  2. "File Extension .WAV Details". Filext.com. Retrieved 2015-08-10.
  3. IBM Corporation and Microsoft Corporation (August 1991), Multimedia Programming Interface and Data Specifications 1.0 (TXT), retrieved 2009-12-06
  4. P. Kabal (2006-06-19). "Audio File Format Specifications - WAVE or RIFF WAVE sound file". McGill University. Retrieved 2010-03-16.
  5. "Multiple Channel Audio Data and WAVE Files". Microsoft Corporation. 2007-03-07. Retrieved 2010-03-16.
  6. IBM Corporation and Microsoft Corporation (August 1991). "Multimedia Programming Interface and Data Specifications 1.0". Retrieved 2009-12-06.
  7. Library of Congress (2008-09-12). "WAVE Audio File Format". Retrieved 2009-12-06.
  8. Microsoft Corporation (June 20, 1999). "Waveform Audio File Format, MIME Sub-type Registration - INTERNET-DRAFT". IETF. Retrieved 2009-12-06.
  9. "Information about the Multimedia file types that Windows Media Player supports". Microsoft Help and Support. Microsoft Corporation. 12 May 2008. Retrieved 29 May 2009. Windows uses the Wave Form Audio (WAV) file format to store sounds as waveforms. One minute of Pulse Code Modulation (PCM)-encoded sound can occupy as little as 644 kilobytes (KB) or as much as 27 megabytes (MB) of storage.
  10. Branson, Ryan (21 October 2015). "What Makes WAV Better than MP3". Online Video Converter. Retrieved 18 June 2016.
  11. EBU (July 2009), EBU Tech 3306 - MBWF / RF64: An Extended File Format for Audio (PDF), retrieved 2010-01-19
  12. IBM; Microsoft (August 1991), Multimedia Programming Interface and Data Specifications 1.0
  13. IBM & Microsoft 1991, p. 1-1, "The main advantage of RIFF is its extensibility; file formats based on RIFF can be future-proofed, as format changes can be ignored by existing applications."
  14. IBM & Microsoft 1991, PDF p. 56, "Programs must expect (and ignore) any unknown chunks encountered, as with all RIFF forms."
  15. IBM & Microsoft 1991, PDF p. 60 shows an example WAV file with an INFO chunk in this position.
  16. IBM & Microsoft 1991, pp. 2-17 to 2-18
  17. IBM & Microsoft 1991, p. 2-18
  18. Microsoft Multimedia Standards Update, New Multimedia Data Types and Data Techniques, Revision 3.0, April 15, 1994, page 6.
  19. IBM & Microsoft 1991, PDF p. 56
  20. IBM & Microsoft 1991, PDF p. 56 specifies sequencing information in the RIFF form of a WAV file consistent with the formalism: "However, <fmt-ck> must always occur before <wave-data>, and both of these chunks are mandatory in a WAVE file."
  21. IBM & Microsoft 1991, PDF p. 23
  22. IBM & Microsoft 1991, PDF p. 21, INAM appears before ICOP
  23. Specification from IBM & Microsoft 1991 which also describes how the production syntax is interpreted.
  24. "WAVE File Format". archive.org. 1999-11-15. Archived from the original on 1999-11-15. Retrieved 2010-03-16.
  25. "WAVE PCM soundfile format". archive.org. 2003-01-20. Retrieved 2010-03-16.
  26. "The WAVE File Format". Retrieved 2010-03-16.
  27. "ExportPCM.cpp - audacity - Audacity: Free, Cross-Platform Audio Editor and Recorder - Google Project Hosting". Code.google.com. Retrieved 2015-08-10.
  28. "LTspice IV" (PDF). Linear Technologies Corporation. 2009. p. 95. Archived from the original (PDF) on 2012-02-27. Retrieved 2015-09-04.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.