HTML5 Audio

From Wikipedia, the free encyclopedia

HTML5 Audio is a subject of the HTML5 specification, covering audio input, playback, and synthesis, as well as speech to text, in the browser.

<audio> element

The <audio> element represents a sound, or an audio stream.[1] It is commonly used to play back a single audio file within a web page, showing a GUI widget with play/pause/volume controls.
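
As a minimal sketch of the usage described above, the element can also be created from script. The helper name createPlayer and the file name in the usage note are illustrative assumptions; only createElement, src, and controls are standard DOM features.

```javascript
// Hypothetical helper: builds an <audio> element for a single file.
// `doc` is expected to be a DOM Document (window.document in a browser).
function createPlayer(doc, url) {
  const audio = doc.createElement('audio');
  audio.src = url;        // the single audio file to play back
  audio.controls = true;  // ask the browser to show its play/pause/volume widget
  return audio;
}
```

In a page, this might be used as document.body.appendChild(createPlayer(document, 'song.ogg')). Passing the document in as a parameter is only a design convenience that keeps the helper independent of global state.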

Supported audio codecs

This table documents the current support for audio codecs by the <audio> element.

Formats supported by different web browsers

Browser | Operating system | Ogg Vorbis | WAV PCM | MP3 | AAC | WebM Vorbis | Ogg Opus
Google Chrome | All supported | 9 | Yes | Yes | Yes | Yes | 25 (since v31 in Windows)
Internet Explorer | Windows | No | No | 9 | 9 | No | No
Mozilla Firefox | All supported | 3.5 | 3.5 | Windows (21.0), Linux (24.0), OS X (26.0) | Windows (21.0) and Linux (24.0) only | 4.0 | 15.0
Opera | All supported | 10.50 | 11.00 | 14 | 14 | 10.60 | 14
Safari | OS X | Yes | 3.1 | 3.1 | 3.1 | No | No

The adoption of HTML5 audio, as with HTML5 video, has become polarized between proponents of free and patented formats. In 2007, the W3C retracted the recommendation to use Vorbis from the specification, together with the recommendation to use Ogg Theora, citing the lack of a format accepted by all the major browser vendors.

Apple and Microsoft, which between them account for around 39% of the browser market[citation needed], support the ISO/IEC-defined formats AAC and the older MP3. They cited[citation needed] superior performance[citation needed] and the risk of a submarine patent attack from formats that are believed, but not guaranteed, to be “free”.

Mozilla and Opera, controlling 24% of the market, support the free and open, royalty-free Vorbis codec in Ogg and WebM containers, and criticize the patent-encumbered nature of MP3 and AAC, which are guaranteed to be “non-free”.

Google, controlling 27% of the market, has so far provided support for all common formats.

The result is that for a website to guarantee HTML5 audio for all the above browsers, it has to make two formats available: Vorbis, and either MP3 or AAC.
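
The two-format requirement above can be handled at run time by asking the browser which encoding it accepts. The following is a sketch under stated assumptions: the helper pickSource, the SOURCES list, and the file names are hypothetical, while canPlayType is the standard HTMLMediaElement method that returns "", "maybe", or "probably".

```javascript
// Candidate encodings of the same clip, in order of preference.
// The file names are placeholders.
const SOURCES = [
  { url: 'clip.ogg', type: 'audio/ogg; codecs="vorbis"' },
  { url: 'clip.mp3', type: 'audio/mpeg' },
];

// Returns the first source the browser reports it may be able to play,
// or null if none qualifies. `canPlayType` should behave like
// HTMLMediaElement.canPlayType, which returns "", "maybe" or "probably".
function pickSource(canPlayType, sources) {
  for (const source of sources) {
    if (canPlayType(source.type) !== '') {
      return source;
    }
  }
  return null;
}
```

In a browser, the check could be wired up as const audio = new Audio(); const chosen = pickSource((type) => audio.canPlayType(type), SOURCES); and, if a source is returned, audio.src = chosen.url. The same fallback can be expressed declaratively by listing several source elements inside one audio element, in which case the browser picks the first one it can play.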

Gecko-based applications and Safari also support PCM audio in a WAVE container.[3]

In 2012, the free and open royalty-free Opus format was released and standardized by the IETF. It has been supported by Mozilla’s software since Gecko version 15.[4][5][6]

Web Audio API and MediaStream Processing API

The Web Audio API specification developed by the W3C describes a high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is of an audio routing graph, where a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized Assembly / C / C++ code), but direct JavaScript processing and synthesis are also supported.[7]
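
The routing-graph paradigm can be illustrated with a small sketch that connects a source node to a processing node and then to the output. The helper name buildToneGraph and its parameters are assumptions made for illustration; createOscillator, createGain, connect, and destination are members of the Web Audio API's AudioContext and AudioNode interfaces.

```javascript
// Builds a two-node graph: oscillator -> gain -> destination.
// `ctx` is expected to be an AudioContext (or an object with the same methods).
function buildToneGraph(ctx, frequencyHz, volume) {
  const oscillator = ctx.createOscillator(); // source node (synthesis)
  const gain = ctx.createGain();             // processing node (volume)

  oscillator.frequency.value = frequencyHz;
  gain.gain.value = volume;

  // Connecting the nodes defines the rendering path.
  oscillator.connect(gain);
  gain.connect(ctx.destination);

  return { oscillator, gain };
}
```

In a browser, a one-second tone might be produced with const ctx = new AudioContext(); const graph = buildToneGraph(ctx, 440, 0.2); graph.oscillator.start(); graph.oscillator.stop(ctx.currentTime + 1). Some browsers only allow audio to start after a user gesture such as a click.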

Mozilla's Firefox browser has implemented a similar Audio Data API extension since version 4 (implemented in 2010[8] and released in 2011), but Mozilla warns that it is non-standard and deprecated and recommends the Web Audio API instead.[9] Some JavaScript audio processing and synthesis libraries such as Audiolet support both APIs.

The W3C Audio Working Group is also considering the MediaStream Processing API specification developed by Mozilla.[10] In addition to audio mixing and processing, it covers more general media streaming, including synchronization with HTML elements, capture of audio and video streams, and peer-to-peer routing of such media streams.[11]

Supported browsers

  • PC
  • Mobile
    • Google Chrome for Android 28 (Enabled by default since 29)
    • Mobile Safari 6
    • Mozilla Firefox 23 (Enabled by default since 25)
    • Tizen

Web Speech API

The Web Speech API aims to provide an alternative input method for web applications (without using a keyboard). With this API, developers can give web apps the ability to transcribe a user's voice to text using the computer's microphone. In server-based implementations, the recorded audio is sent to speech servers for transcription, after which the text is entered for the user. The API itself is agnostic of the underlying speech recognition implementation and can support both server-based and embedded recognizers.[14] The HTML Speech Incubator group has proposed the implementation of audio-speech technology in browsers in the form of uniform, cross-platform APIs. The API contains both:[15]

  • Speech Input API
  • Text to Speech API

Google integrated this feature into Google Chrome in March 2011,[16] letting its users search the web by voice with code like:

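<script type="application/javascript">
    // Submit the enclosing form as soon as speech has been recognized.
    function startSearch(event) {
        event.target.form.submit();
    }
</script>
<form action="http://www.google.com/search">
  <!-- The handler attribute must call the function; a bare name does nothing. -->
  <input type="search" name="q" speech required onspeechchange="startSearch(event)">
</form>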
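
The two parts of the API listed above, speech input and text to speech, can also be driven from script. The sketch below assumes the SpeechRecognition, speechSynthesis, and SpeechSynthesisUtterance interfaces from the Web Speech API draft; the helper names listenOnce and speak are hypothetical, and the language tag is an arbitrary choice.

```javascript
// Speech input: starts one recognition pass and reports the transcript.
// `Recognition` is expected to be the SpeechRecognition constructor
// (exposed as webkitSpeechRecognition in Chrome).
function listenOnce(Recognition, onText) {
  const recognition = new Recognition();
  recognition.lang = 'en-US';         // assumed language, adjust as needed
  recognition.interimResults = false; // report final results only
  recognition.onresult = (event) => {
    // The first alternative of the first result holds the best transcript.
    onText(event.results[0][0].transcript);
  };
  recognition.start();
  return recognition;
}

// Text to speech: queues `text` for synthesis.
// `synth` is expected to be window.speechSynthesis and `Utterance`
// the SpeechSynthesisUtterance constructor.
function speak(synth, Utterance, text) {
  synth.speak(new Utterance(text));
}
```

In a supporting browser, this might be called as listenOnce(window.SpeechRecognition || window.webkitSpeechRecognition, (text) => speak(window.speechSynthesis, SpeechSynthesisUtterance, text)), which simply reads back what was heard. Availability varies between browsers, so feature detection is advisable before calling either helper.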

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike license; additional terms may apply for the media files.