Syntax highlighting

From Wikipedia, the free encyclopedia

HTML syntax highlighting

Syntax highlighting is a feature of some text editors that displays text, especially source code, in different colors and fonts according to the category of terms. This feature eases writing in a structured language such as a programming language or a markup language, because both structures and syntax errors are visually distinct. Some editors also integrate syntax highlighting with other features, such as spell checking or code folding.

Benefits

Syntax highlighting is one strategy to improve the readability and context of text, especially for code that spans several pages. The reader can easily skip over large sections of comments or code, depending on which is of interest.

Syntax highlighting also helps programmers find errors in their programs. For example, most editors highlight string literals in a different color. Consequently, a missing delimiter is much easier to spot because of the contrasting color of the affected text.

Brace matching is another important feature of many popular editors. It highlights the brace under the cursor and its matching brace in a different color, which makes it simple to see whether a brace has been left out or to locate the match of a given brace.
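One way an editor might locate the partner of the bracket under the cursor is a single pass with a stack of open brackets. The following is a minimal sketch, not the method of any particular editor: the function name find_matching_bracket is hypothetical, and the sketch does not skip brackets that appear inside strings or comments.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of the bracket that matches the one
// at `pos`, or std::nullopt if `pos` is not on a bracket or has no partner.
// Simplification: brackets inside strings and comments are not skipped.
std::optional<std::size_t> find_matching_bracket(const std::string& text,
                                                 std::size_t pos) {
    const std::string openers = "([{";
    const std::string closers = ")]}";
    if (pos >= text.size()) return std::nullopt;
    if (openers.find(text[pos]) == std::string::npos &&
        closers.find(text[pos]) == std::string::npos) {
        return std::nullopt;  // the cursor is not on a bracket
    }

    std::vector<std::size_t> open_positions;  // brackets opened but not yet closed
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (openers.find(c) != std::string::npos) {
            open_positions.push_back(i);
            continue;
        }
        const std::size_t k = closers.find(c);
        if (k == std::string::npos) continue;  // ordinary character

        if (open_positions.empty() || text[open_positions.back()] != openers[k]) {
            if (i == pos) return std::nullopt;  // the bracket under the cursor is stray
            continue;                           // ignore stray closers elsewhere
        }
        const std::size_t open = open_positions.back();
        open_positions.pop_back();
        if (open == pos) return i;  // cursor was on the opener
        if (i == pos) return open;  // cursor was on the closer
    }
    return std::nullopt;  // the opener under the cursor was never closed
}
```

A missing result is itself useful: an editor could flag the bracket under the cursor as unmatched instead of highlighting a pair.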

Some text editors can also export the colored markup in a format suitable for printing or for importing into word-processing or other text-formatting software, for instance as an HTML, colorized LaTeX, PostScript, or RTF version of the syntax-highlighted text.
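As an illustration of HTML export, the sketch below wraps each already-classified piece of text in a span element whose class names its category, so that a stylesheet can assign the colors. The function names escape_html and to_html and the class name "kw" are assumptions made for this example, not the output format of any particular editor.

```cpp
#include <string>
#include <utility>
#include <vector>

// Replace the characters that have a special meaning in HTML text.
std::string escape_html(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Each input pair is (CSS class, text); an empty class means unstyled text.
std::string to_html(const std::vector<std::pair<std::string, std::string>>& spans) {
    std::string out = "<pre>";
    for (const auto& [css_class, text] : spans) {
        if (css_class.empty()) {
            out += escape_html(text);
        } else {
            out += "<span class=\"" + css_class + "\">" + escape_html(text) + "</span>";
        }
    }
    out += "</pre>";
    return out;
}
```

Keeping the categories as class names rather than hard-coded colors leaves the color scheme to the stylesheet of the receiving document.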

Document class

For editors that support more than one language, the user can specify the language of the text (such as C, LaTeX, or HTML), or the text editor can recognize it automatically based on the file extension or by scanning the contents of the file.
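Automatic recognition can be sketched as a lookup on the file extension followed by a scan of the contents when the extension is missing or unknown. The function name detect_language, the extension table, and the content markers below are illustrative assumptions; real editors use much larger tables and more careful heuristics.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical detector: try the file extension first, then look for
// telltale markers in the contents. Both tables are deliberately tiny.
std::string detect_language(const std::string& filename,
                            const std::string& contents) {
    static const std::map<std::string, std::string> by_extension = {
        {"c", "C"}, {"h", "C"}, {"tex", "LaTeX"}, {"html", "HTML"}, {"htm", "HTML"},
    };

    const std::size_t dot = filename.rfind('.');
    if (dot != std::string::npos) {
        std::string ext = filename.substr(dot + 1);
        // Compare extensions case-insensitively.
        std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char ch) {
            return static_cast<char>(std::tolower(ch));
        });
        const auto it = by_extension.find(ext);
        if (it != by_extension.end()) return it->second;
    }

    // Fallback: scan the contents for markers typical of each language.
    if (contents.find("\\documentclass") != std::string::npos) return "LaTeX";
    if (contents.find("<!DOCTYPE html") != std::string::npos ||
        contents.find("<html") != std::string::npos) {
        return "HTML";
    }
    if (contents.find("#include") != std::string::npos) return "C";
    return "plain text";
}
```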

One approach for supporting syntax highlighting for multiple languages is the use of "Document classes." Each separate language constitutes a separate "class" of document. Each "class" can be associated with a specific set of syntax coloring rules. This approach affords a certain degree of flexibility for multi-language editors, but there are some potential limitations as well.

For example, sometimes a user may want to:

  • treat a single document as belonging to more than just one document class (for example when editing an HTML file that contains embedded JavaScript code).
  • edit a language that does not belong to a recognized class (for example when editing source code for an obscure or relatively new programming language).

Some multi-language editors that use the "Document class" method provide additional features that specifically address these limitations.

Syntax elements

Most editors with syntax highlighting allow different colors and text styles to be given to dozens of different lexical sub-elements of syntax. These include keywords, comments, control-flow statements, variables, and other elements. Programmers often heavily customize their settings in an attempt to show as much useful information as possible without making the code difficult to read.

Example

Below is a snippet of syntax-highlighted C++ code:

// Allocate all the windows
for (int i = 0; i < max; i++)
{
    wins[i] = new Window();
}

In this example, the editor has recognized the keywords for, int, and new. It recognized the variable names i, wins, and max and highlighted them accordingly. The comment at the beginning is also highlighted in a specific manner to distinguish it from working code.
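The classification described above can be approximated with a small hand-written scanner. The sketch below is an assumption-laden illustration rather than the algorithm of any real editor: the names TokenKind, Token, tokenize, and texts_of are hypothetical, the keyword table holds only the three keywords that occur in the snippet, and only line comments are recognized.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class TokenKind { Comment, Keyword, Identifier, Number, Other };

struct Token {
    TokenKind kind;
    std::string text;
};

// Hypothetical scanner: splits source text into classified tokens.
std::vector<Token> tokenize(const std::string& src) {
    // Just enough keywords for the snippet; a real table would be far longer.
    static const std::set<std::string> keywords = {"for", "int", "new"};
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        const std::size_t start = i;
        if (src.compare(i, 2, "//") == 0) {
            // A line comment runs to the end of the line.
            while (i < src.size() && src[i] != '\n') ++i;
            tokens.push_back({TokenKind::Comment, src.substr(start, i - start)});
        } else if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            const std::string word = src.substr(start, i - start);
            const TokenKind kind =
                keywords.count(word) ? TokenKind::Keyword : TokenKind::Identifier;
            tokens.push_back({kind, word});
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else {
            ++i;  // punctuation and operators, one character at a time
            tokens.push_back({TokenKind::Other, src.substr(start, 1)});
        }
    }
    return tokens;
}

// Collect the text of every token of one kind, in source order.
std::vector<std::string> texts_of(const std::vector<Token>& tokens, TokenKind kind) {
    std::vector<std::string> out;
    for (const Token& t : tokens) {
        if (t.kind == kind) out.push_back(t.text);
    }
    return out;
}
```

Because the scanner works purely lexically, it reports the class name Window as an ordinary identifier alongside i, wins, and max. Telling type names apart from variable names requires knowledge that a scanner of this kind does not have.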

History and limitations

The Live Parsing Editor (LEXX or LPEX) was written for the computerization of the Oxford English Dictionary in 1985 and was probably the first to use color syntax highlighting. Its live parsing capability allowed user-supplied parsers to be added to the editor for text, programs, data files, and so on. See: Cowlishaw, M. F., "LEXX – A programmable structured editor", IBM Journal of Research and Development, Vol. 31, No. 1, 1987, IBM Reprint order number G322-0151.

Most text editors highlight syntax using complex pattern-matching heuristics rather than a full parser for each supported language, since implementing such parsers could be prohibitively complex. As a result, the highlighting is almost never completely accurate. Moreover, depending on the pattern-matching algorithms, the highlighting "engine" can become very slow for certain types of language structures. Some editors overcome this problem by parsing only the visible area rather than the whole file, sometimes scanning backwards in the text up to a limited number of lines for "syncing."
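The "syncing" strategy and its inaccuracy can be illustrated with block comments. In the sketch below, the highlighter backs up a limited number of lines from the first visible line, guesses that it is outside any comment at that point, and scans forward from there. The function names scan_line and first_visible_line_in_comment and the parameter max_sync_lines are hypothetical, and the sketch ignores strings and line comments for brevity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Advance the "inside a /* ... */ comment" state across one line.
bool scan_line(const std::string& line, bool in_comment) {
    std::size_t i = 0;
    while (i < line.size()) {
        if (in_comment) {
            const std::size_t end = line.find("*/", i);
            if (end == std::string::npos) return true;  // comment continues
            in_comment = false;
            i = end + 2;
        } else {
            const std::size_t begin = line.find("/*", i);
            if (begin == std::string::npos) return false;  // no comment opens
            in_comment = true;
            i = begin + 2;
        }
    }
    return in_comment;
}

// Estimate whether the first visible line starts inside a block comment by
// scanning at most `max_sync_lines` lines above it. The state at the sync
// point is a guess ("not in a comment"), so the answer can be wrong when a
// comment opened earlier than the sync window reaches.
bool first_visible_line_in_comment(const std::vector<std::string>& lines,
                                   std::size_t first_visible,
                                   std::size_t max_sync_lines) {
    const std::size_t start =
        first_visible > max_sync_lines ? first_visible - max_sync_lines : 0;
    bool in_comment = false;  // the guess at the sync point
    for (std::size_t n = start; n < first_visible && n < lines.size(); ++n) {
        in_comment = scan_line(lines[n], in_comment);
    }
    return in_comment;
}
```

With a comment that opens on the second line of a file and the fourth line as the first visible one, a sync window of two lines reaches the opening delimiter and gives the right answer, while a window of one line misses it and treats the comment text as code. A larger window improves accuracy at the cost of more scanning on every redraw.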

See the Programming features section of the Comparison of text editors article for a list of some editors that have syntax highlighting.