Metacharacter

A metacharacter is a character that has a special meaning (instead of a literal meaning) to a computer program, such as a shell interpreter or a regular expression engine.

In POSIX extended regular expressions,[1] there are 14 metacharacters that must be preceded by a backslash "\" in order to drop their special meaning and be treated literally inside an expression: the opening and closing square brackets, "[" and "]"; the backslash "\"; the caret "^"; the dollar sign "$"; the period or dot "."; the vertical bar or pipe symbol "|"; the question mark "?"; the asterisk "*"; the plus sign "+"; the opening and closing curly braces, "{" and "}"; and the opening and closing parentheses, "(" and ")".[2]

To use any of these characters as a literal in a regex, escape it with a backslash. For example, to match the arithmetic expression "(1+1)*3=6" literally, the correct regex is "\(1\+1\)\*3=6". Without the backslashes, the parentheses, plus sign, and asterisk keep their special meanings: the parentheses form a group, the plus sign means "one or more of the preceding item", and the asterisk means "zero or more of the preceding item".
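The difference between the escaped and unescaped forms can be checked with a short sketch. It uses Python's re module, which is not a POSIX extended regular expression engine but treats these particular metacharacters the same way; the names text, escaped, unescaped, and matches_literally are hypothetical and chosen only for this illustration.

```python
import re

# The literal string we want to match, taken from the example above.
text = "(1+1)*3=6"

# Escaped form: every metacharacter is preceded by a backslash.
escaped = r"\(1\+1\)\*3=6"

# Unescaped form: "(", ")", "+" and "*" keep their special meanings.
unescaped = "(1+1)*3=6"


def matches_literally(pattern, s):
    """Return True if the whole string s is matched by pattern."""
    return re.fullmatch(pattern, s) is not None


print(matches_literally(escaped, text))       # escaped pattern matches the literal text
print(matches_literally(unescaped, text))     # unescaped pattern does not
print(matches_literally(unescaped, "113=6"))  # it matches strings such as "113=6" instead

# re.escape builds an escaped pattern automatically.
print(matches_literally(re.escape(text), text))
```

Here the unescaped pattern reads as "zero or more repetitions of the group (one or more 1s followed by a 1), then 3=6", which is why it accepts "113=6" but rejects the original expression.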

Examples

Escaping

To escape a metacharacter is to strip it of its special meaning so that it is treated as an ordinary, literal character. For example, in PCRE a period (.) stands for "any single character". In the pattern A.C, the period can match B, a single space, or any other single character (by default, anything except a newline), so the pattern matches ABC, A C, and so on. A single period always stands for exactly one character. If the period is escaped, as in A\.C, it loses its special meaning and matches only a literal period.
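The behaviour of the period can be illustrated with a minimal sketch. It assumes Python's re module rather than PCRE itself; the two engines agree on the meaning of an escaped and an unescaped period for this case. The helper which_match and the list candidates are hypothetical names introduced for the example.

```python
import re

# Strings to test against both patterns.
candidates = ["ABC", "A C", "A.C", "AC", "ABBC"]


def which_match(pattern, strings):
    """Return the strings that are matched in full by pattern."""
    compiled = re.compile(pattern)
    return [s for s in strings if compiled.fullmatch(s)]


# Unescaped period: exactly one arbitrary character between A and C.
print(which_match(r"A.C", candidates))

# Escaped period: only a literal "." between A and C.
print(which_match(r"A\.C", candidates))
```

With the unescaped period, "ABC", "A C", and "A.C" are accepted, while "AC" and "ABBC" are rejected because the period stands for exactly one character. With the escaped period, only "A.C" is accepted.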

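The usual way to escape a character in a regex is to precede it with a backslash (\). A related convention in command-line shells is the double hyphen (--), which marks the end of options, so that none of the arguments following it on the line are interpreted as options.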

See also

References


This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.