Data: URI scheme

From Wikipedia, the free encyclopedia

The correct title of this article is data: URI scheme. The initial letter is shown capitalized due to technical restrictions.

The data: URI scheme, defined in IETF standard RFC 2397, is a URI scheme that allows small data items to be included inline, as if they were referenced as an external resource. data: URIs tend to be far simpler than alternative inclusion methods, such as MIME with cid: or mid:. According to the wording in the RFC, data: URIs are in fact URLs, although they do not actually locate anything.

data: URIs are currently supported by:

Microsoft's Internet Explorer, as of version 7, does not support data: URIs. Early versions of Internet Explorer treated unrecognised about: URIs as HTML source, so something like about:<b>bold</b> in these versions was broadly equivalent to data:text/html,<b>bold</b> in browsers that support data: URIs.

Advantages

  • HTTP request and header traffic is not required for embedded data, so data: URIs use fewer network resources whenever the overhead of encoding the inline content as a data: URI is smaller than the HTTP overhead.
  • Web browsers are typically configured to use a maximum of two concurrent connections to a server (as recommended by RFC 2616), so inline data frees up a download connection for other content.
  • Browsers manage fewer cache entries for a file that contains data: URIs.
  • Environments with limited or restricted access to external resources may embed content when it is disallowed or impractical to reference externally. For example, an advanced HTML editing field could accept a pasted or inserted image and convert it to a data: URI to hide the complexity of external resources from the user.
  • HTTPS environments commonly require that every element requested over HTTP be delivered securely, or the user will be notified of a mix of secure and insecure elements. HTTPS requests have significant overhead compared with plain HTTP requests, so embedding data in data: URIs improves speed more significantly in this case.

Disadvantages

  • Embedded content must be extracted and decoded before changes may be made, then re-encoded and re-embedded afterwards.
  • Information that is embedded more than once is redownloaded as part of the containing file, and thus does not benefit from the browser's cache.
  • Browser limits on URI length impose a maximum data size. For example, URIs in Opera used to have a limit of 4 kB.
  • Data is included as a simple stream, and many processing environments (such as web browsers) may not support using containers (such as multipart/alternative or message/rfc822) to provide greater complexity such as metadata, data compression, or content negotiation.
  • Microsoft's Internet Explorer, as of versions 6 and 7, lacks support.

Under certain conditions there are some possible additional disadvantages:

  • Base64-encoded data: URIs are roughly 33% larger in size than their binary equivalent. See note (1)
  • URL-encoded data: URIs can be up to 200% larger (in extreme cases) than the original text content. See note (1)

(1) This is less of a disadvantage if a content encoding mechanism such as gzip is used, for example via an HTTP Content-Encoding header.
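
The two size figures above can be checked with a short calculation. The following is a minimal sketch, assuming Node.js (for Buffer) and assuming that the "safe" characters are the RFC 3986 unreserved set; the function names percentEncodeBytes and encodingOverhead are illustrative only.

```javascript
// Compare the size of a payload with its base64 and percent-encoded forms.
// Assumption: "safe" characters are the RFC 3986 unreserved set.

// Percent-encode every octet outside the unreserved set.
function percentEncodeBytes(bytes) {
  let out = '';
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    out += /[A-Za-z0-9\-._~]/.test(ch)
      ? ch
      : '%' + b.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

// Return the growth of each encoding as a fraction of the input size.
function encodingOverhead(bytes) {
  const buf = Buffer.from(bytes);
  const base64Length = buf.toString('base64').length;
  const percentLength = percentEncodeBytes(buf).length;
  return {
    base64: (base64Length - buf.length) / buf.length,
    percent: (percentLength - buf.length) / buf.length,
  };
}

// 300 octets that all need escaping: the worst case for percent-encoding.
const worstCase = encodingOverhead(new Uint8Array(300).fill(0xff));
console.log(worstCase); // base64 grows by about a third, percent-encoding by 200%
```

For plain ASCII text made up only of unreserved characters, the percent-encoded form is no larger than the original, which is why the 200% figure applies only in extreme cases.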

Format

data:[<MIME-type>][;base64],<data>

The encoding is indicated by ;base64. If it is present, the data is encoded as base64. Without it, the data (as a sequence of octets) is represented using ASCII encoding for octets inside the range of safe URL characters and the standard %xx hex encoding of URLs for octets outside that range. If <MIME-type> is omitted, it defaults to text/plain;charset=US-ASCII. (As a shorthand, the type can be omitted but the charset parameter supplied.)
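
As an illustration of the two forms, the following sketch assembles a data: URI from a string or byte buffer. It assumes Node.js and treats only the RFC 3986 unreserved characters as safe, which is more conservative than strictly necessary; toDataUri is a hypothetical helper name, not part of any standard API.

```javascript
// Build a data: URI in either the base64 or the percent-encoded form.
function toDataUri(content, mimeType = '', useBase64 = false) {
  const buf = Buffer.from(content); // strings are converted to UTF-8 octets
  if (useBase64) {
    return `data:${mimeType};base64,${buf.toString('base64')}`;
  }
  // Keep unreserved characters as-is and write every other octet as %XX.
  let data = '';
  for (const b of buf) {
    const ch = String.fromCharCode(b);
    data += /[A-Za-z0-9\-._~]/.test(ch)
      ? ch
      : '%' + b.toString(16).toUpperCase().padStart(2, '0');
  }
  return `data:${mimeType},${data}`;
}

console.log(toDataUri('<b>bold</b>', 'text/html'));
// data:text/html,%3Cb%3Ebold%3C%2Fb%3E
console.log(toDataUri('hi', 'text/plain', true));
// data:text/plain;base64,aGk=
console.log(toDataUri('A brief note'));
// data:,A%20brief%20note
```

The last call omits the type, so a consumer would treat the result as text/plain;charset=US-ASCII.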

Examples

HTML

An HTML fragment embedding a picture of a small red dot:

<img src="data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
vr4MkhoXe0rZigAAAABJRU5ErkJggg==" alt="Red dot" />

Note that, as a URI, the data: URI should be formattable with whitespace, but there are practical issues with how that relates to base64 encoding [1]. Authors should avoid using whitespace in base64-encoded data: URIs.
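
One way to follow this advice while still keeping a wrapped, readable copy in source material is to strip the whitespace before publishing. The sketch below is only an illustration; stripDataUriWhitespace is a hypothetical helper name, and it deliberately leaves non-base64 URIs untouched.

```javascript
// Remove spaces, tabs and line breaks from the payload of a base64 data: URI.
function stripDataUriWhitespace(uri) {
  const comma = uri.indexOf(',');
  if (!/^data:/i.test(uri) || comma === -1) {
    throw new Error('not a data: URI');
  }
  const header = uri.slice(0, comma);
  if (!/;base64$/i.test(header)) {
    return uri; // only base64 payloads are normalised here
  }
  return header + ',' + uri.slice(comma + 1).replace(/\s+/g, '');
}

const wrapped = 'data:image/png;base64,\niVBORw0K\n  GgoAAAAN';
console.log(stripDataUriWhitespace(wrapped));
// data:image/png;base64,iVBORw0KGgoAAAAN
```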

CSS

A CSS rule that includes a background image (again, newlines added for clarity):

ul.checklist > li.complete { margin-left: 20px; background:
  url(
    ABlBMVEUAAAD///+l2Z/dAAAAM0lEQVR4nGP4/5/h/1+G/58ZDrAz3D/McH8yw83NDDeN
    Ge4Ug9C9zwz3gVLMDA/A6P9/AFGGFyjOXZtQAAAAAElFTkSuQmCC) top left no-repeat; }

See the CSS 2.1 URI syntax for more information on allowed characters and quoting.

JavaScript

A JavaScript statement that opens an embedded subwindow, as for a footnote link:

window.open('data:text/html;charset=utf-8,%3C!DOCTYPE%20HTML%20PUBLIC%20%22-'+
  '%2F%2FW3C%2F%2FDTD%20HTML%204.0%2F%2FEN%22%3E%0D%0A%3Chtml%20lang%3D%22en'+
  '%22%3E%0D%0A%3Chead%3E%3Ctitle%3EEmbedded%20Window%3C%2Ftitle%3E%3C%2Fhea'+
  'd%3E%0D%0A%3Cbody%3E%3Ch1%3E42%3C%2Fh1%3E%3C%2Fbody%3E%0D%0A%3C%2Fhtml%3E'+
  '%0D%0A','_blank','height=300,width=400');
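
The escaping in the statement above does not have to be written by hand. The following sketch produces an equivalent kind of URI with the built-in encodeURIComponent function; htmlToDataUri is a hypothetical helper name, and the markup is a shortened version of the page used above.

```javascript
// Build a text/html data: URI from markup instead of escaping it manually.
function htmlToDataUri(html) {
  return 'data:text/html;charset=utf-8,' + encodeURIComponent(html);
}

const footnote = htmlToDataUri(
  '<html lang="en"><head><title>Embedded Window</title></head>' +
  '<body><h1>42</h1></body></html>'
);
console.log(footnote);

// In a browser, the result can be passed to window.open as in the statement above.
if (typeof window !== 'undefined') {
  window.open(footnote, '_blank', 'height=300,width=400');
}
```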

Inclusion in HTML or CSS using PHP

Because data: URIs are not human-readable, a website author might prefer that the encoded data be included in the page via a scripting language such as PHP. This has the advantage that if the included file changes, no modifications need to be made to the HTML file; it also keeps binary data separate from text-based formats. Disadvantages include greater server CPU and disk use, since the server must read and encode the file on every request, although this can be lessened with a server-side page cache.

<?php
// Return a base64 data: URI for the given file and MIME type.
function data_url($file, $mime)
{
  $contents = file_get_contents($file);  // read the raw file bytes
  $base64   = base64_encode($contents);  // encode them as base64
  return ('data:' . $mime . ';base64,' . $base64);
}
?>

<img src="<?php echo data_url("elephant.png","image/png")?>" alt="An elephant" />

Similarly, if CSS is processed by PHP, the above function may also be used:

<?php header('Content-type: text/css');?>

div.menu
{
  background-image:url(<?php echo data_url("menu_background.png","image/png")?>);
}

In either case, client-side or server-side feature or user-agent detection (such as conditional comments) may be used to provide a standard http: URL for Internet Explorer and other older browsers.
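
The fallback decision itself can be sketched independently of how detection is done. In the example below, the supportsDataUri and maxUriLength properties are hypothetical results of whatever detection a site already performs, chooseImageSource is an illustrative name, and the example.com address is a placeholder.

```javascript
// Choose between an inline data: URI and an ordinary http: URL.
// `client` is a hypothetical object describing the detected browser.
function chooseImageSource(dataUri, fallbackUrl, client) {
  if (!client.supportsDataUri) {
    return fallbackUrl; // for example, Internet Explorer 6 or 7
  }
  if (client.maxUriLength !== undefined && dataUri.length > client.maxUriLength) {
    return fallbackUrl; // the URI would exceed the browser's length limit
  }
  return dataUri;
}

const inline = 'data:image/png;base64,iVBORw0KGgoAAAAN';
const external = 'http://example.com/elephant.png'; // placeholder URL

console.log(chooseImageSource(inline, external, { supportsDataUri: false }));
console.log(chooseImageSource(inline, external, { supportsDataUri: true, maxUriLength: 4096 }));
```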

See also

  • An alternative for attaching resources to an HTML document is MIME HTML, usually found in HTML email messages.
  • MIME, for the media types used
