A grapheme (from Greek γράφω (gráphō), meaning "write") is the smallest semantically distinguishing unit in a written language. It does not carry meaning by itself. Graphemes include alphabetic letters, Chinese characters, numerical digits, punctuation marks, and the individual symbols of any of the world's writing systems.
Graphemes are usually transcribed within angle brackets to show their special status, for example <a>, <b>, <c>.[1]
Different glyphs can be concrete representations of the same abstract grapheme, in which case they are allographs. For example, lowercase <a> has two common variants: one with a hook at the top and one without (see ɑ).
In an ideal, fully phonemic orthography, each grapheme would correspond to exactly one phoneme. However, all spelling systems are to some extent non-phonemic. Some scholars rank alphabetic orthographies by the complexity of their grapheme-to-phoneme correspondences: the standard Spanish (Castilian) and Finnish orthographies are considered shallow, German and Czech intermediate, and French and English deep.
Multiple graphemes may represent a single phoneme. These are called digraphs (two graphemes for a single phoneme) and trigraphs (three graphemes). For example, the word "ship" may be regarded as containing four graphemes (<s>, <h>, <i>, and <p>) representing the three phonemes /ʃ/, /ɪ/, and /p/. In some languages, a group of graphemes is also treated as a single unit for the purposes of collation; for example, in a Czech dictionary, the section for words that start with "Ch" comes after that for "H".[2]
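The two ideas above, digraphs as single units and collation by unit rather than by letter, can be sketched in a few lines of Python. This is only an illustrative sketch: the digraph inventory, the function names `segment` and `czech_key`, and the sample words are assumptions made for the example, and the Czech ordering is simplified by leaving out letters with diacritics.

```python
# Hypothetical, deliberately tiny digraph inventory (illustration only).
DIGRAPHS = {"sh", "ch", "th", "ph"}


def segment(word, digraphs=DIGRAPHS):
    """Split a word into units, greedily pairing letters that form a listed digraph."""
    lower = word.lower()
    units = []
    i = 0
    while i < len(lower):
        pair = lower[i:i + 2]
        if pair in digraphs:
            units.append(pair)   # two letters, one unit
            i += 2
        else:
            units.append(lower[i])
            i += 1
    return units


# Simplified Czech ordering: "ch" is a single collation unit placed right after "h".
# Assumption: letters with diacritics are omitted to keep the sketch short.
CZECH_ORDER = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
CZECH_RANK = {unit: rank for rank, unit in enumerate(CZECH_ORDER)}


def czech_key(word):
    """Sort key that ranks each collation unit, treating "ch" as one unit."""
    return [CZECH_RANK[unit] for unit in segment(word, {"ch"})]


print(segment("ship"))                      # ['sh', 'i', 'p']
words = ["chata", "hrad", "cena", "jablko"]
print(sorted(words))                        # ['cena', 'chata', 'hrad', 'jablko']
print(sorted(words, key=czech_key))         # ['cena', 'hrad', 'chata', 'jablko']
```

The greedy pairing is a simplification: it would wrongly join <s> and <h> in a word such as "mishap", where the two letters belong to different syllables and represent separate phonemes.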
Furthermore, a particular grapheme can represent different phonemes on different occasions, and vice versa. For instance, in English the sound /f/ can be represented by "F", "f", "ff", "ph", "gh", and so on, while the grapheme <f> can also represent the phoneme /v/ (as in the word "of").
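This many-to-many relationship can be made concrete with a small lookup table. The Python sketch below is illustrative only: the list `CORRESPONDENCES` and the example words "fan", "off", "phone", and "laugh" are assumptions added for the example, and capital "F" is left out for brevity.

```python
from collections import defaultdict

# (spelling, phoneme, example word); the example words are illustrative additions.
CORRESPONDENCES = [
    ("f", "f", "fan"),
    ("ff", "f", "off"),
    ("ph", "f", "phone"),
    ("gh", "f", "laugh"),
    ("f", "v", "of"),
]

spellings_of = defaultdict(set)   # phoneme  -> spellings that can represent it
phonemes_of = defaultdict(set)    # spelling -> phonemes it can represent
for spelling, phoneme, _example in CORRESPONDENCES:
    spellings_of[phoneme].add(spelling)
    phonemes_of[spelling].add(phoneme)

print(sorted(spellings_of["f"]))  # ['f', 'ff', 'gh', 'ph']
print(sorted(phonemes_of["f"]))   # ['f', 'v']
```

Neither direction of the table is a function: one phoneme maps to several spellings, and one spelling maps to more than one phoneme.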
In English and other languages, spelling can be chosen to convey morphological relationships; for instance, the link between "sign" and "signature" is closer in writing than in speech. In some English personal names and place names, the spelling is so distant from the pronunciation that correspondences between phonemes and graphemes cannot be identified. Examples are Marjoribanks (pronounced Marshbanks) and Featherstonehaugh (pronounced Fanshaw). Moreover, in many other words the pronunciation has evolved to follow a fixed spelling, so that the phonemes may be said to represent the graphemes rather than vice versa. Similarly, in much technical jargon the primary medium of communication is the written language rather than the spoken language, so the phonemes appear to represent the graphemes.
In a script such as Japanese kana, one grapheme corresponds primarily to a syllable.
Not all graphemes represent phonemes. For example, the logogram ampersand (<&>) is derived from the Latin word "et" and is used for "and" in many languages. It therefore does not directly represent any particular combination of phonemes. Arabic numerals provide a similar example.