Unicode input

From Wikipedia, the free encyclopedia

Many systems provide direct Unicode input support in some form, allowing the selection of arbitrary Unicode characters.

Selection from a screen

Many systems provide a way to select Unicode characters visually. ISO 14755 refers to this as a screen-selection entry method. On some systems this is limited to characters that are present in a specified font, or to characters for which a font containing them exists at all.

Microsoft Windows has provided a Unicode version of the Character Map program since Windows NT 4.0, and in the consumer editions since Windows XP. It is limited to characters in the Basic Multilingual Plane. Characters are searchable by Unicode character name, and the table can be limited to a particular code block.
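The kind of name search and block filtering described above can be sketched in a few lines of Python. This is only an illustration and not how Character Map itself is implemented: the function name search_by_name and its parameters are hypothetical, and the names come from whatever Unicode database ships with the Python interpreter in use.

```python
import unicodedata

def search_by_name(fragment, start=0x0000, end=0xFFFF):
    """Return (code point, name) pairs in [start, end] whose name contains fragment.

    The default range is the Basic Multilingual Plane (an assumption made
    here to mirror the limitation described in the text).
    """
    fragment = fragment.upper()
    matches = []
    for cp in range(start, end + 1):
        # name() returns the default "" for unnamed or unassigned code points.
        name = unicodedata.name(chr(cp), "")
        if name and fragment in name:
            matches.append((cp, name))
    return matches

# Limit the search to the Latin-1 Supplement block (U+0080..U+00FF).
for cp, name in search_by_name("tilde", 0x0080, 0x00FF):
    print(f"U+{cp:04X} {chr(cp)} {name}")
```

Passing a narrower start and end plays the role of limiting the table to one code block.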

Mac OS X provides a "character palette" with much the same functionality, along with searching by related characters, glyph tables in a font, etc.

Equivalent tools (such as gucharmap) exist on most Linux desktop environments.

Hex input

Clause 5.1 of ISO 14755 describes a Basic method in which a beginning sequence is followed by the hexadecimal representation of the code point and then by an ending sequence. On some systems, this is limited to the BMP (characters up to U+FFFF).
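As a rough sketch of the conversion step only, the following Python function turns the hexadecimal digits typed between the beginning and ending sequences into a character. The function name hex_to_char and the bmp_only flag are hypothetical; the flag models the BMP restriction mentioned above, and the handling of keystrokes themselves is left out.

```python
import string

def hex_to_char(hex_digits, bmp_only=False):
    """Convert typed hexadecimal digits to the corresponding character."""
    if not hex_digits or any(c not in string.hexdigits for c in hex_digits):
        raise ValueError(f"not a hexadecimal number: {hex_digits!r}")
    code_point = int(hex_digits, 16)
    if code_point > 0x10FFFF:
        raise ValueError("beyond the Unicode code space (U+10FFFF)")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not characters")
    if bmp_only and code_point > 0xFFFF:
        raise ValueError("outside the BMP (above U+FFFF)")
    return chr(code_point)

print(hex_to_char("0041"))   # the letter A
print(hex_to_char("f1"))     # ñ
```

With bmp_only=True, an input such as "1F600" is rejected rather than converted, which is the behaviour a BMP-limited system would show.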

An example of an ISO 14755-conformant system is GTK+, where the beginning sequence is Ctrl+Shift+U and the ending sequence is null. In some older versions, Ctrl and Shift must be held down while the number is entered: GTK+ versions before 2.10 do not use Ctrl+Shift+U, only Ctrl+Shift+[hex number].

  • The RichEdit control on Microsoft Windows (as used in, for example, WordPad) supports the following input method: one first enters the character’s hexadecimal code, then immediately presses Alt+X. For example, entering f1 and then pressing the combination will produce the character ñ. The code must not be immediately preceded by any digits or the letters a-f, as these would be treated as part of the code to be converted. This also works in Microsoft Word 2002/2003 for Windows.
  • In the Vim editor, the user first types Ctrl-V u and then the hexadecimal number of the desired character, which is converted into that character. In Emacs, the equivalent command is M-x ucs-insert.
  • In Mac OS X, and in Mac OS 8.5 and later, one chooses the Unicode Hex Input keyboard layout. Holding down the Option key, one then types the four-digit hex Unicode code point. On releasing the Option key, the equivalent character will appear.[1]
  • On Microsoft Windows, if the registry key HKEY_Current_User\Control Panel\Input Method\EnableHexNumpad has a value of "1", one can hold down Alt, press the "plus" key on the numeric keypad, and then type the hex code to enter the character.[2]
  • In Linux, first press Ctrl+Shift+U, then type the desired hexadecimal code. For example, type "0041" to get the letter "A".
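The caveat in the RichEdit item, that hex digits or the letters a-f directly before the code are taken as part of it, can be illustrated with a simplified model. The Python sketch below is not RichEdit's actual algorithm: the function alt_x is hypothetical and simply assumes that the whole run of hex digits at the end of the text is converted.

```python
import string

def alt_x(text):
    """Replace the trailing run of hex digits in text with the character it encodes."""
    i = len(text)
    # Walk backwards over every trailing character that is a hex digit.
    while i > 0 and text[i - 1] in string.hexdigits:
        i -= 1
    digits = text[i:]
    if not digits:
        return text  # nothing to convert
    code_point = int(digits, 16)
    if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        return text  # not a valid character; leave the text unchanged
    return text[:i] + chr(code_point)

print(alt_x("f1"))       # ñ
print(alt_x("Espaf1"))   # the "a" is absorbed: "af1" is read as U+0AF1, not "Espa" + ñ
```

In this model, typing "Espaf1" does not give "Españ", because the preceding "a" is itself a hex digit and becomes part of the code.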

Decimal input

In some applications on Microsoft Windows, particularly those using the RichEdit control, decimal Unicode code points (e.g., 256 for U+0100) are supported with Alt codes.
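The relationship between a decimal code and the usual U+ notation is plain base conversion, as the short Python sketch below shows. The function name decimal_to_char is hypothetical, and the sketch covers only the number-to-character step, not the Alt key handling.

```python
def decimal_to_char(decimal_digits):
    """Convert a decimal code point, given as a string of digits, to a character."""
    if not (decimal_digits.isascii() and decimal_digits.isdigit()):
        raise ValueError(f"not a decimal number: {decimal_digits!r}")
    return chr(int(decimal_digits))

ch = decimal_to_char("256")
print(f"256 -> U+{ord(ch):04X} {ch}")   # 256 decimal is 100 hexadecimal, i.e. U+0100
```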

References

  1. Typing special and accented characters
  2. How to enter Unicode characters in Microsoft Windows