Private Use (Unicode)

In Unicode, Private Use is a mechanism that allows characters to be defined and used by private agreement between parties (that is, without involving Unicode), using designated code points. Such a private definition may include publication of a font that supports the definition (showing the characters), and processes to support privately defined graphic or even control effects (e.g. a clickable <do print> character). As a stability rule, the Unicode Standard guarantees that these Private Use code points will never be assigned regular characters, so Unicode will never interfere with the private agreement. The private agreement may be published, and often is.

For example, Apple Inc. has assigned the Apple logo to Private-use code point U+F8FF <private-use-F8FF>, and maintains this assignment in its fonts and systems.

By definition, multiple private parties may assign different characters to the same code point. As a consequence, a user whose system selects the wrong font may see characters from a different private definition.

1 Definition
2 Private Use Areas
3 Background
4 Usage
- 4.1 Tentative coordination
- 4.2 Example code point U+F8FF
5 References

Definition

Unicode treats Private-use code points as assigned characters (as opposed to, say, reserved code points), but it defines no specific interpretation for them, and their default properties can be overridden by the private agreement. As part of the stability of the standard, these code points will never be assigned a regular Unicode character:

Characters in these [Private Use] areas will never be defined by the Unicode Standard. These code points can be freely used for characters of any purpose, but successful interchange requires an agreement between sender and receiver on their interpretation.^[1]^[2]

All Private-use characters have General Category=Co (Other, private use), and no other code points have this category.
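
The General Category can be inspected programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `is_private_use` is an assumption made for this illustration and is not part of any standard.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch has General Category Co."""
    return unicodedata.category(ch) == "Co"


# A letter, four Private-use code points, and one noncharacter.
for cp in (0x0041, 0xE000, 0xF8FF, 0xF0000, 0x10FFFD, 0x10FFFF):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.category(ch), is_private_use(ch))
```

The last code point in the loop, U+10FFFF, lies in plane 16 but is a noncharacter, so the check is expected to report it as not Private-use.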

Private Use Areas

There are three blocks of private-use code points, each called a Private Use Area (PUA). The Basic Multilingual Plane (plane 0) contains the block Private Use Area with 6,400 code points, and planes 15 and 16 contain the blocks Supplementary Private Use Area-A and Supplementary Private Use Area-B respectively, with 65,534 code points each. In UTF-16, code points in the two private-use planes are represented by surrogate pairs of BMP code units. The high surrogates are those in the BMP block High Private Use Surrogates (U+DB80..U+DBFF, 128 code points), each combined with any of the low surrogates (U+DC00..U+DFFF, 1,024 code points). The one-to-one mapping between a surrogate pair and a U+xxxxxx code point is defined by UTF-16.
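
The mapping between a supplementary code point and its UTF-16 surrogate pair is simple arithmetic. The sketch below illustrates it; the function names `to_surrogate_pair` and `from_surrogate_pair` are chosen for this example only.

```python
def to_surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000              # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits select the low surrogate
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Combine a high and a low surrogate back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


print([hex(u) for u in to_surrogate_pair(0xF0000)])    # ['0xdb80', '0xdc00']
print([hex(u) for u in to_surrogate_pair(0x10FFFD)])   # ['0xdbff', '0xdffd']
```

The first and last Private-use code points of planes 15 and 16 map to high surrogates U+DB80 and U+DBFF, which are the bounds of the High Private Use Surrogates block.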

Private Use Areas in Unicode (General Category=Co)^[a]^[b]
Range	Plane	Block name	Number of code points	Note
U+E000..U+F8FF	BMP (0)	Private Use Area	6400	
U+F0000..U+FFFFD	PUP (15)^[c]	Supplementary Private Use Area-A	65534	In UTF-16, encoded using block High Private Use Surrogates (U+DB80..U+DBFF) in the BMP.
U+100000..U+10FFFD	PUP (16)^[c]	Supplementary Private Use Area-B	65534	In UTF-16, encoded using block High Private Use Surrogates (U+DB80..U+DBFF) in the BMP.
Notes
^[a] Unicode Standard chapter 2
^[b] Unicode Standard chapter 16.5
^[c] Private Use Plane: Unicode has not published identifying names for planes 15 and 16. Section 2.8 refers to "The two Private Use Planes (Planes 15 and 16)", while the block names are Supplementary Private Use Area-A and Supplementary Private Use Area-B.
The final two code points of each of these planes, U+xxFFFE and U+xxFFFF, are noncharacters and not Private-use characters.
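
The ranges in the table can be checked with a few lines of code. This is an illustrative sketch; the names `PRIVATE_USE_RANGES` and `in_private_use_area` are assumptions made for the example.

```python
# Inclusive (first, last) code point ranges, copied from the table above.
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),       # BMP (plane 0)
    (0xF0000, 0xFFFFD),     # plane 15
    (0x100000, 0x10FFFD),   # plane 16
]


def in_private_use_area(cp: int) -> bool:
    """Return True if code point cp falls inside one of the three ranges."""
    return any(first <= cp <= last for first, last in PRIVATE_USE_RANGES)


sizes = [last - first + 1 for first, last in PRIVATE_USE_RANGES]
print(sizes)        # [6400, 65534, 65534]
print(sum(sizes))   # 137468

print(in_private_use_area(0xF8FF))    # True
print(in_private_use_area(0xFFFFE))   # False: noncharacter at the end of plane 15
```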

Background

The concept of private use was already present in earlier encodings. East Asian systems used End-User-Defined Characters (EUDC).^[1]

In ISO 6429, the C1 control set contains two private-use control codes: U+0091 <control-0091> (named private use one, PU1) and U+0092 <control-0092> (named private use two, PU2). Although the C1 controls are incorporated in Unicode, PU1 and PU2 are not considered Private Use characters by Unicode.^[3]

Usage

Tentative coordination

Many individuals and institutions have published self-defined characters using the PUA. To limit unnecessary overlap, an informal organisation maintains and publishes an incomplete list of private-use assignments. With this overview, publishers can aim for unused or less-used code points. By definition, however, exclusive use cannot be guaranteed, because any party is free to use any PUA code point.

The list is maintained by the ConScript Unicode Registry (which is not related to the Unicode Consortium).
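
The kind of overlap that such coordination tries to avoid can be detected mechanically once assignments are written down as ranges. The sketch below is purely illustrative: the allocation names and ranges are invented for the example and do not reflect any real registry entry.

```python
from itertools import combinations

# Hypothetical allocations (name -> inclusive code point range).
# These names and ranges are invented for illustration only.
allocations = {
    "script-A": (0xE000, 0xE07F),
    "script-B": (0xE080, 0xE0FF),
    "vendor-icons": (0xE0F0, 0xE12F),
}


def overlaps(a, b):
    """Return True if two inclusive ranges share at least one code point."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_conflicts(allocs):
    """List every pair of allocation names whose ranges overlap."""
    return [
        (name1, name2)
        for (name1, r1), (name2, r2) in combinations(allocs.items(), 2)
        if overlaps(r1, r2)
    ]


print(find_conflicts(allocations))   # [('script-B', 'vendor-icons')]
```

A check like this only covers assignments that are known to whoever runs it, which matches the point above that exclusive use cannot be guaranteed.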

Example code point U+F8FF

Unicode code point U+F8FF or  is the last code point in the Private Use Area in the BMP. Its meaning and appearance vary depending on the font in use, and its use in several widely distributed fonts makes it the most notable code point in the Private Use Area.

Some early Tengwar fonts map Elvish characters to it.
The Imitari font draws it as a capital eth.
The font Luxi draws it as the euro sign.
The font "Standard Symbols L" uses it as one of the box drawing characters.
The official PRC standard on precomposed Tibetan uses the code point for the Tibetan syllable "hwo".
Some font makers place a copyright statement or other creator's mark at that code point.
- For example, the dingbats font "DavysDingbats" uses it to display a face, presumably that of the font's creator.
- In most Apple-supplied fonts, it represents the Apple logo, or an early version of the command key.
The ConScript Unicode Registry suggests using it for the Klingon character KLINGON MUMMIFICATION GLYPH. Some fonts, such as Code2000, follow this suggestion.
In Wingdings 1, U+F8FF is the Windows logo. On some computers, however, the logo appears at U+F000 instead of U+F8FF.
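
The examples above differ only in the glyph that each font supplies. The stored text is the same in every case, as this small sketch using Python's standard library suggests; the placeholder string `"<no name>"` is an arbitrary choice for the example.

```python
import unicodedata

ch = chr(0xF8FF)

# The standard assigns no character name, only the category Co.
print(unicodedata.category(ch))            # Co
print(unicodedata.name(ch, "<no name>"))   # <no name>

# The encoded bytes are identical whichever font later renders them.
print(ch.encode("utf-8").hex())            # efa3bf
print(ch.encode("utf-16-be").hex())        # f8ff
```

Because nothing in the encoded text records which private agreement was intended, the choice of font (or other out-of-band information) determines what the reader sees.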

References

^[1] Unicode Standard, chapter 16.5: Private Use characters
^[2] Unicode Standard, chapter 2: General Structure
^[3] ISO C1 Control Character Set of ISO 6429 (1983)
