Left-to-right mark

The left-to-right mark (LRM) is a control character (an invisible formatting character) used in the computerized typesetting of text that mixes left-to-right text (such as English and Russian) with right-to-left text (such as Arabic, Persian and Hebrew). It is used to set the way adjacent characters are grouped with respect to text direction.

Unicode

In Unicode, the LRM is encoded as U+200E LEFT-TO-RIGHT MARK (HTML &#8206; · &lrm;). Its UTF-8 encoding is E2 80 8E. Its usage is prescribed in the Unicode Bidirectional (Bidi) Algorithm.
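The encoding facts above can be checked with a short sketch. It assumes Python 3 and uses only the standard unicodedata module; the constant name LRM is chosen here for illustration.

```python
import unicodedata

# The left-to-right mark, written as an escape so it stays visible in source.
LRM = "\u200e"

code_point = ord(LRM)                           # 0x200E, decimal 8206
name = unicodedata.name(LRM)                    # official Unicode character name
utf8_hex = LRM.encode("utf-8").hex()            # UTF-8 bytes as a hex string
bidi_class = unicodedata.bidirectional(LRM)     # class used by the Bidi algorithm
category = unicodedata.category(LRM)            # general category (format character)

print(f"U+{code_point:04X} {name}")
print("UTF-8:", utf8_hex)
print("Bidi class:", bidi_class, "| category:", category)
```

The bidirectional class reported for the mark is "L", the same strong left-to-right class that ordinary Latin letters carry, which is why it can stand in for an invisible left-to-right letter.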

Example of use in HTML

Suppose a writer wishes to insert a run of English (i.e. left-to-right) text into an Arabic or Hebrew paragraph, with non-alphabetic characters at the end of the English text (on the right). The sentence "The language C++ is a programming language used..." written in Arabic, but with "C++" in English, renders as follows:

‫ لغة C++ هي لغة برمجة تستخدم...

With an LRM entered in the HTML after the ++, it renders as follows:

‫ لغة C++‎ هي لغة برمجة تستخدم...

Standards-compliant browsers will render the ++ on the left in the first example, and on the right in the second. This happens because the browser recognizes that the paragraph is in an RTL script (Arabic) and positions punctuation, which is neutral as to its direction, in coordination with the more prominent (paragraph-level) adjacent text. The LRM causes the punctuation to be adjacent only to LTR text (the "C" and the LRM), so it is positioned as if it were in left-to-right text, i.e., to the right of the preceding text. Some software may require &#8206; or &lrm; rather than the invisible Unicode character itself; the actual invisible character would also make copy editing difficult.
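As a minimal sketch of the technique described above, the following Python code appends an LRM to the "C++" run and then writes the mark as the visible &lrm; entity, so that it stays editable in HTML source. The helper names append_lrm and to_html_source are hypothetical and chosen for illustration; only the standard html module is assumed.

```python
import html

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK


def append_lrm(ltr_run: str) -> str:
    """Return the left-to-right run followed by an LRM (hypothetical helper)."""
    return ltr_run + LRM


def to_html_source(text: str) -> str:
    """Escape text for HTML and spell each LRM as the visible &lrm; entity."""
    escaped = html.escape(text, quote=False)
    return escaped.replace(LRM, "&lrm;")


# Arabic sentence with an embedded English run, as in the example above.
sentence = "لغة " + append_lrm("C++") + " هي لغة برمجة"
source = to_html_source(sentence)

print(source)
```

Decoding the result with html.unescape gives back the original string, including the invisible mark, so the entity form loses no information.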
