DOC (computing)

Word Document
Filename extension .doc
Internet media type application/msword
Uniform Type Identifier com.microsoft.word.doc[1]
Developed by Microsoft
Type of format Word document
Container for Text, Image, Table
Extended to Microsoft Office XML formats, Office Open XML
Open format? no

In computing, DOC or doc (an abbreviation of 'document') is a filename extension for word processing documents, most commonly those created by Microsoft Word. Historically, the extension was used for documentation in plain-text format, particularly of programs or computer hardware, on a wide range of operating systems. During the 1980s, WordPerfect used DOC as the extension of its proprietary format. Later, in the 1990s, Microsoft chose the DOC extension for its proprietary Microsoft Word word processing formats. The original uses for the extension have largely disappeared from the PC world.

Microsoft's DOC binary file format

Binary DOC files often contain more text formatting information (as well as scripts and undo information) than some other document file formats like Rich Text Format and HyperText Markup Language, but are usually less widely compatible.

The DOC format varies among versions of Microsoft Word. Versions before Word 97 used a different format from the one shared by Word 97 through Word 2003.

In Microsoft Office Word 2007 the binary file format was replaced as the default format by the Office Open XML format.
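Because the same .doc extension can sit on files in different containers, a program may want to inspect the first bytes of a file rather than trust its name. The following is a minimal sketch, not a full format detector. It assumes that Word 97-2003 binary files are stored in an OLE2 Compound File container (signature `D0 CF 11 E0 A1 B1 1A E1`) and that Office Open XML files are ZIP packages (signature `PK\x03\x04`). The function name `sniff_word_container` and its return labels are hypothetical, chosen only for illustration.

```python
import io

# Signature of the OLE2 Compound File container used by Word 97-2003 binary files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Signature of a ZIP local file header; Office Open XML files are ZIP packages.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_word_container(stream):
    """Return a rough label for the container found at the start of a binary stream.

    The label names are illustrative, not part of any standard.
    """
    header = stream.read(8)
    if header.startswith(OLE2_MAGIC):
        return "ole2-compound-file"
    if header.startswith(ZIP_MAGIC):
        return "zip-package"
    return "unknown"


if __name__ == "__main__":
    sample = io.BytesIO(OLE2_MAGIC + b"\x00" * 8)
    print(sniff_word_container(sample))
```

A match on the OLE2 signature shows only that the file is a compound file. Other legacy Office formats use the same container, so a real detector would also have to look inside it for a Word document stream.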

Application support

The DOC format is native to Microsoft Office Word, but other word processors, such as OpenOffice.org Writer, IBM Lotus Symphony, Google Docs, Apple Pages and AbiWord, can create and read .doc files, although with some limitations. On Unix-like operating systems, command-line programs that convert DOC files to plain text or other standard formats include those built on the wv library, which AbiWord also uses directly.

Specification

Because the .doc file format was a closed specification for many years, word processors handle it inconsistently, and formatting information may be lost when the same file is handled by multiple word processing programs. Some specifications for MS Office 97 binary file formats were published in 1997 under a restrictive license, but these specifications were withdrawn from online download in 1999.[2][3][4][5] Specifications of later versions of MS Office binary file formats were not publicly available. From 2006[7] until February 2008, the DOC format specification was available from Microsoft on request[6] under restrictive RAND-Z terms. Following reverse-engineered documentation produced by Sun and OpenOffice.org,[8] Microsoft released a .DOC format specification[9] under the Microsoft Open Specification Promise.[10][11] However, this specification does not describe all of the features used by the DOC format, and reverse engineering remains necessary.[12]

Other file formats

Some historical documentation may use the .doc filename extension for plain-text files. The .doc filename extension was also used in historical versions of WordPerfect for its proprietary format.

Some software applications use the name "DOC" in combination with other words (such as the name of the software manufacturer) for different file formats. For example, on Palm OS, DOC is shorthand for PalmDoc, a completely unrelated format (commonly using the filename extension ".pdb") used to encode text files such as ebooks.
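A PalmDoc file can be told apart from a Word document by its Palm database header rather than by its name. The sketch below assumes the usual Palm database layout, in which a 4-byte type code and a 4-byte creator code sit at byte offsets 60 and 64, and that PalmDoc databases carry the type `TEXt` and the creator `REAd`. The function name `is_palmdoc` is hypothetical.

```python
import io
import struct

# Assumed Palm database header layout: type code at offset 60, creator at 64.
PDB_TYPE_OFFSET = 60
PALMDOC_TYPE = b"TEXt"
PALMDOC_CREATOR = b"REAd"


def is_palmdoc(stream):
    """Return True if the stream starts with a Palm database header marked as PalmDoc."""
    header = stream.read(PDB_TYPE_OFFSET + 8)
    if len(header) < PDB_TYPE_OFFSET + 8:
        return False  # too short to hold the type and creator fields
    type_code, creator = struct.unpack_from("4s4s", header, PDB_TYPE_OFFSET)
    return type_code == PALMDOC_TYPE and creator == PALMDOC_CREATOR


if __name__ == "__main__":
    fake_header = b"\x00" * 60 + b"TEXtREAd" + b"\x00" * 10
    print(is_palmdoc(io.BytesIO(fake_header)))
```

The check reads only the header and does not validate the record list or decompress any text, so it is a quick filter rather than a parser.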

References

  1. ^ "System-Declared Uniform Type Identifiers (Mac OS X v10.4)". Apple Developer Connection. Apple Inc. 2008-04-08. http://developer.apple.com/documentation/Carbon/Conceptual/understanding_utis/utilist/chapter_4_section_1.html#//apple_ref/doc/uid/TP40001319-CH205-BHACGADF. 
  2. ^ (pdf) Comparing ODF and OOXML, 2006, http://marketing.openoffice.org/ooocon2006/presentations/wednesday_o3.pdf, retrieved 2011-05-23 
  3. ^ Beware of Geeks Bearing Gifts, 2006, http://www.robweir.com/blog/2006/11/beware-of-geeks-bearing-gifts.html, retrieved 2011-05-23 
  4. ^ A Word 8 converter for Unix, ftp://ftp.gwdg.de/pub/gnu/www/software/mswordview/MSWordView.html, retrieved 2011-05-23 
  5. ^ "Microsoft Word 97 Binary File Format". http://www.opennet.ru/docs/formats/wword8.html#01. Retrieved 2011-05-23. 
  6. ^ "Royalty-free specifications for Microsoft Office binary file formats". http://www.wictorwilen.se/Post/Royaltyfree-specifications-for-Microsoft-Office-binary-file-formats.aspx. Retrieved 2011-05-23. 
  7. ^ "Mapping documents in the binary format (.doc; .xls; .ppt) to the Open XML format". 2008-01-16. http://blogs.msdn.com/b/brian_jones/archive/2008/01/16/mapping-documents-in-the-binary-format-doc-xls-ppt-to-the-open-xml-format.aspx. Retrieved 2011-05-23. 
  8. ^ "Microsoft Compound Document Format". OpenOffice.org. 2007-08-07. http://sc.openoffice.org/compdocfileformat.pdf. 
  9. ^ "Microsoft Office Word 97 - 2007 Binary File Format Specification (*.doc)". Microsoft Corporation. 2007. http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/Word97-2007BinaryFileFormat(doc)Specification.pdf. 
  10. ^ "Microsoft Open Specification Promise". Microsoft Corporation. March 23, 2009. http://www.microsoft.com/interop/osp/default.mspx. 
  11. ^ "How to extract information from Office files by using Office file formats and schemas". http://support.microsoft.com/kb/840817/en-us. Retrieved 2011-05-23. 
  12. ^ Joel Spolsky. "Why are the Microsoft Office file formats so complicated? (And some workarounds)". http://www.joelonsoftware.com/items/2008/02/19.html. Retrieved 2011-05-23. 