Lexical grammar

From Wikipedia, the free encyclopedia

In computer science, a lexical grammar can be thought of as the syntax of tokens: the rules governing how a character sequence is divided into subsequences of characters, each of which represents an individual token.

For instance, the lexical grammar for many programming languages specifies that a string starts with a " character and continues until a matching " is found, that an identifier is a sequence of letters and digits beginning with a letter, and that a number is a sequence of digits. So in the character sequence "abc" xyz1 23 the tokens are a string, an identifier and a number: the space character terminates the sequence of characters forming the identifier, so xyz1 and 23 are separate tokens.
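
The three rules above can be sketched as a small tokenizer. This is an illustrative sketch only, not the lexical grammar of any particular language: the names TOKEN_RULES, MASTER and tokenize are hypothetical, the patterns are simplified (strings have no escape sequences), and the requirement that an identifier begin with a letter is an assumption made so that 23 is not also matched as an identifier.

```python
import re

# Hypothetical token rules mirroring the prose; the patterns are assumptions.
TOKEN_RULES = [
    ("string", r'"[^"]*"'),                  # from a " up to the matching "
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"), # letters and digits, letter first (assumed)
    ("number", r"[0-9]+"),                   # a sequence of digits
    ("space", r"\s+"),                       # separates tokens; not itself a token
]

# Combine the rules into one pattern with a named group per rule.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES))


def tokenize(text):
    """Split text into (kind, lexeme) pairs according to TOKEN_RULES."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize('"abc" xyz1 23'))
# [('string', '"abc"'), ('identifier', 'xyz1'), ('number', '23')]
```

Each rule is a regular expression, which is how lexical grammars are commonly specified. Whitespace is matched but discarded, which is what makes it act as a token separator in this sketch.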