HTML e-mail

From Wikipedia, the free encyclopedia

HTML e-mail is the use of a subset of HTML (often ill-defined) to provide formatting and semantic markup capabilities in e-mail that are not available with plain text.

Most graphical e-mail clients support HTML e-mail, and many default to it.[1] Many of these clients include both a GUI editor for composing HTML e-mails and a rendering engine for displaying received HTML e-mails.

HTML mail allows the sender to properly express quotations (as in inline replying), headings, bulleted lists, emphasized text, subscripts and superscripts, and other visual and typographic cues that improve the readability and aesthetics of the message. It can also carry semantic information encoded within the message, such as the original author and Message-ID of a quote. Long URLs can be linked to without being broken into multiple pieces, and text is wrapped to fit the width of the user agent's viewport instead of being uniformly broken at 78 characters per line (the limit recommended in RFC 2822, which was necessary on older text terminals). HTML mail also allows in-line inclusion of tables, as well as diagrams or mathematical formulae as images, which are otherwise difficult to convey (typically using ASCII art).

Adoption

Since its inception, a number of people have vocally opposed all HTML e-mail (and even MIME itself), for a variety of reasons. While HTML e-mail is still considered inappropriate in many newsgroup postings and mailing lists, its adoption for personal and business mail has only increased over time. Some of those who strongly opposed it when it first came out now see it as mostly harmless.[2]

According to surveys by online marketing companies, adoption of HTML-capable e-mail clients is now nearly universal, with fewer than 3% of users reporting that they use text-only clients.[3] A smaller number of users, though still a majority, prefer HTML over plain text.[4]

Compatibility

Because HTML mail is more complex than plain text, it is also more prone to compatibility issues and to inconsistent rendering across platforms and software.

Some popular clients do not render consistently with W3C specifications, and many HTML e-mails are themselves non-compliant, which may cause rendering or delivery problems, especially for users of MSN or Hotmail.[3]

In particular, the <head> element, which houses the CSS style rules for an entire HTML document, is poorly supported and sometimes stripped entirely. This has made in-line style declarations the de facto standard, even though they are not optimal from a semantic web point of view.[5] Although workarounds have been developed,[6] the situation has caused considerable frustration among newsletter developers and spawned the grassroots Email Standards Project, which grades e-mail clients on their rendering of an acid test and lobbies developers to improve their products.[7] To persuade Google to improve rendering in Gmail, for instance, the project published a video montage of grimacing web developers, which drew attention from an employee.

Style

Some senders rely excessively on large, colorful, or distracting fonts, making messages more difficult to read.[8] Recipients who especially dislike certain types of formatting can override them in their user agent while still seeing other formatting and getting the other benefits of HTML. For instance, Mozilla Thunderbird makes it easy to specify a minimum font size.

Multi-part formats

The default e-mail format according to RFC 2822 is plain text, so e-mail software is not required to support HTML formatting. Sending HTML-formatted e-mail can therefore cause problems if the recipient's client does not support it: the recipient may see the HTML source code or nothing at all.

Many e-mail clients are configured to automatically generate a plain text version of a message and send it along with the HTML version, so that it can be read even by text-only e-mail clients. This is done with the Content-Type multipart/alternative, as specified in RFC 1521.[9][10][11] The message itself is of type multipart/alternative and contains two parts: the first, of type text/plain, is read by text-only clients, and the second, of type text/html, is read by HTML-capable clients. The plain text version may be missing important formatting information, however. (For example, an equation may lose a superscript and take on an entirely new meaning.)
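
The two-part structure described above can be sketched with Python's standard email package. This is only an illustration, not how any particular client builds its messages; the addresses, subject, and the helper name build_alternative_message are made up for the example.

```python
from email.message import EmailMessage


def build_alternative_message(plain_text: str, html_text: str) -> EmailMessage:
    """Build a multipart/alternative message (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = "Example"                # placeholder header values
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    # First part: text/plain, for text-only clients.
    msg.set_content(plain_text)
    # Second part: text/html. Adding it turns the message into
    # multipart/alternative with the plain text as the first part.
    msg.add_alternative(html_text, subtype="html")
    return msg


message = build_alternative_message(
    "E = mc^2\n",
    "<p>E = mc<sup>2</sup></p>\n",
)
part_types = [part.get_content_type() for part in message.iter_parts()]
print(message.get_content_type())
print(part_types)
```

The plain part here writes the exponent as "^2" by hand; an automatically generated plain text version might simply drop the superscript, which is the loss of meaning mentioned above.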

Many mailing lists deliberately block HTML e-mail, either stripping out the HTML part to just leave the plain text part or rejecting the entire message.
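
A rough sketch of the stripping behaviour, again using Python's standard email package, might look like the following. The function name keep_plain_text_only and the sample message are assumptions for illustration; real mailing list software applies its own rules and preserves many more headers.

```python
from email import message_from_string, policy
from email.message import EmailMessage

# A made-up multipart/alternative message as it might arrive at a list.
RAW_MESSAGE = """\
From: sender@example.com
To: list@example.com
Subject: Example
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain; charset="utf-8"

Hello, list.

--BOUNDARY
Content-Type: text/html; charset="utf-8"

<p>Hello, <b>list</b>.</p>

--BOUNDARY--
"""


def keep_plain_text_only(raw: str):
    """Return a plain-text-only copy of the message, or None to reject it."""
    original = message_from_string(raw, policy=policy.default)
    plain_part = original.get_body(preferencelist=("plain",))
    if plain_part is None:
        return None  # no text/plain part: the caller would reject the message
    stripped = EmailMessage()
    # Copy only a few headers; a real list manager would keep more.
    for header in ("From", "To", "Subject"):
        if original[header] is not None:
            stripped[header] = original[header]
    stripped.set_content(plain_part.get_content())
    return stripped


stripped_message = keep_plain_text_only(RAW_MESSAGE)
print(stripped_message.get_content_type())
```

Returning None for a message without a plain text part models the "reject the entire message" case.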

Message size

HTML e-mail is larger than plain text. Even if no special formatting is used, there will be the overhead from the tags used in a minimal HTML document, and if formatting is heavily used it may be much higher. Multi-part messages, with duplicate copies of the same content in different formats, increase the size even further. The plain text section of a multi-part message can be retrieved by itself, though, using IMAP's FETCH command.[12]
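
The ordering of sizes can be checked with a small sketch that serializes the same short sentence three ways using Python's standard email package. The sample text, the minimal HTML document, and the helper serialized_size are assumptions for illustration; actual sizes depend on the client and on the headers it adds.

```python
from email.message import EmailMessage

BODY_TEXT = "See you at noon.\n"
# A minimal HTML document carrying the same sentence with no special formatting.
MINIMAL_HTML = (
    "<html><head><title></title></head>"
    "<body><p>See you at noon.</p></body></html>\n"
)


def serialized_size(msg: EmailMessage) -> int:
    """Size in bytes of the message as it would be written out."""
    return len(msg.as_bytes())


plain_only = EmailMessage()
plain_only.set_content(BODY_TEXT)

html_only = EmailMessage()
html_only.set_content(MINIMAL_HTML, subtype="html")

# Multi-part message holding both copies of the content.
both = EmailMessage()
both.set_content(BODY_TEXT)
both.add_alternative(MINIMAL_HTML, subtype="html")

sizes = {
    "plain": serialized_size(plain_only),
    "html": serialized_size(html_only),
    "multipart": serialized_size(both),
}
print(sizes)
```

For a message this short the tag and MIME overhead dominates; for longer messages the relative overhead of an unformatted HTML version shrinks.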

Although the difference in download time between plain text and multi-part mail (whose sizes can differ by a factor of ten or more) was of concern in the 1990s, when most users were accessing e-mail servers through slow modems, on a modern connection the difference is negligible, especially when compared to images, music files, or other common attachments.[13]

Security vulnerabilities

HTML allows a link to have a different target than the link's text. This can be used in phishing attacks, in which users are fooled into believing that a link points to the website of an authoritative source (such as a bank), visit it, and unintentionally reveal personal details (like bank account numbers) to a scammer.
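
The mismatch between a link's text and its target can be detected mechanically in simple cases. The sketch below uses Python's standard html.parser and urllib.parse modules; the class and function names, the sample HTML, and the heuristic (only link text that itself looks like a web address is compared) are assumptions for illustration, not a description of how any real mail client or filter works.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(url):
    """Return the lower-cased host name of a URL, or None if it has none."""
    try:
        return urlparse(url).hostname
    except ValueError:
        return None


def find_mismatched_links(html):
    """Return links whose visible text names a different host than the href."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    mismatched = []
    for href, text in collector.links:
        # Heuristic: only inspect link text that itself looks like a web address.
        if not text.lower().startswith(("http://", "https://", "www.")):
            continue
        shown_host = host_of(text if "://" in text else "http://" + text)
        target_host = host_of(href)
        if shown_host and target_host and shown_host != target_host:
            mismatched.append((href, text))
    return mismatched


SAMPLE_HTML = (
    '<p><a href="http://phish.example.net/login">www.bank.example.com</a> '
    '<a href="http://www.example.org/">our site</a></p>'
)
suspicious = find_mismatched_links(SAMPLE_HTML)
print(suspicious)
```

Links with ordinary descriptive text such as "our site" are skipped, so this catches only the most blatant form of the trick.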

If an e-mail contains web bugs (inline content from an external server, such as a picture), the server can alert a third party that the e-mail has been opened. This is a potential privacy risk, revealing that an e-mail address is real (so that it can be targeted in the future) and revealing when the message was read. For this reason, some e-mail clients do not load external images until the user asks them to.
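
As a rough illustration of what a client has to look for before deciding not to load external images, the following sketch lists the image sources in an HTML body that would trigger a request to a remote server. It uses Python's standard html.parser; the names RemoteImageFinder and find_remote_images and the sample HTML are hypothetical, and only <img> elements are examined, although other elements can also reference external content.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class RemoteImageFinder(HTMLParser):
    """Records the src of every <img> that points at a remote server."""

    def __init__(self):
        super().__init__()
        self.remote_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        try:
            parsed = urlparse(src)
        except ValueError:
            return  # malformed URL: ignore it in this sketch
        # http(s) URLs and scheme-relative URLs ("//host/path") are remote;
        # "cid:" references point at parts attached to the message itself.
        if parsed.scheme in ("http", "https") or (not parsed.scheme and parsed.netloc):
            self.remote_sources.append(src)


def find_remote_images(html):
    finder = RemoteImageFinder()
    finder.feed(html)
    finder.close()
    return finder.remote_sources


SAMPLE_HTML = (
    "<p>Hi</p>"
    '<img src="http://tracker.example.com/open.gif?id=123" width="1" height="1">'
    '<img src="cid:logo@example.com">'
)
remote_images = find_remote_images(SAMPLE_HTML)
print(remote_images)
```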

During periods of increased network threats, the US Department of Defense converts all incoming HTML e-mail to text e-mail.[14]

The multipart/alternative type is intended to show the same content in different ways, but this is sometimes abused; some e-mail spam takes advantage of the format to trick spam filters into believing that the message is legitimate. Such messages include innocuous content in the text part and put the spam in the HTML part (which is what displays to the user).
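
One way to illustrate the abuse is to measure how much the two parts of a message actually have in common. The sketch below computes the Jaccard overlap between the words of the text/plain part and the words visible in the text/html part, using only Python's standard library. The function names, the sample messages, and the choice of word overlap as a measure are assumptions for illustration; the source does not describe how real spam filters handle this.

```python
import re
from email.message import EmailMessage
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the character data of an HTML document, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def word_set(text):
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def part_overlap(msg):
    """Jaccard overlap of plain and HTML words, or None if a part is missing."""
    plain = msg.get_body(preferencelist=("plain",))
    html = msg.get_body(preferencelist=("html",))
    if plain is None or html is None:
        return None
    extractor = TextExtractor()
    extractor.feed(html.get_content())
    extractor.close()
    plain_words = word_set(plain.get_content())
    html_words = word_set(" ".join(extractor.chunks))
    union = plain_words | html_words
    if not union:
        return 1.0  # both parts empty: trivially identical
    return len(plain_words & html_words) / len(union)


def make_message(plain_text, html_text):
    """Build a two-part sample message (hypothetical helper)."""
    msg = EmailMessage()
    msg.set_content(plain_text)
    msg.add_alternative(html_text, subtype="html")
    return msg


honest = make_message("Meeting moved to noon.\n", "<p>Meeting moved to <b>noon</b>.</p>\n")
abusive = make_message("Minutes from the board meeting.\n", "<p>Buy cheap pills now</p>\n")
print(part_overlap(honest), part_overlap(abusive))
```

A low overlap only shows that the parts differ; legitimate messages can also differ between parts, so a score like this could at most be one input among many.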

Most e-mail spam is sent in HTML for these reasons, so spam filters sometimes give higher spam scores to HTML messages.

References

  1. ^ Configuring Mail Clients to Send Plain ASCII Text — E-mail client programs
  2. ^ HTML Email: The Poll (Scot Hacker, originator of the much-linked-to "Why HTML in E-Mail is a Bad Idea", discusses how his feelings have changed since the 90s)
  3. ^ a b Email Marketing Statistics and Metrics
  4. ^ Real-World Email Client Usage: The Hard Data
  5. ^ Not your ordinary html email tips
  6. ^ Premailer: make CSS inline for HTML e-mail
  7. ^ Campaign Monitor: Why we need standards support in HTML email
  8. ^ A pretty fair argument against HTML Email
  9. ^ RFC 1521 7.2.3. The Multipart/alternative subtype
  10. ^ TN1010-11-2: Multipart/Alternative — Gracefully handling HTML-phobic email clients.
  11. ^ Sending HTML and Plain Text E-Mail Simultaneously
  12. ^ Do we really want to send web pages in e-mail?
  13. ^ HTML Email — Still Evil?
  14. ^ DOD bars use of HTML e-mail, Outlook Web Access
