Kanji

Kanji (漢字) are the adopted logographic Chinese characters (hanzi)^[1] that are used in the modern Japanese writing system along with hiragana (ひらがな, 平仮名), katakana (カタカナ, 片仮名), Hindu-Arabic numerals, and the occasional use of the Latin alphabet (known as "rōmaji"). The Japanese term kanji (漢字) literally means "Han characters"^[2] and is written with the same characters as the Chinese word hanzi (漢字), which refers to the same character writing system.^[3]

Kanji
Type	Logographic
Languages	Old Japanese, Japanese
Parent systems	Oracle Bone Script → Seal Script → Clerical Script → Kaishu → Kanji
Sister systems	Hanja, Zhuyin, Simplified Chinese, Chu Nom, Khitan script, Jurchen script
ISO 15924	`Hani, 500`
Direction	Left-to-right
Unicode alias	Han

History

Chinese characters first came to Japan on articles imported from China. An early instance of such an import was the King of Na gold seal, given by Emperor Guangwu of the Han Dynasty in 57 AD.^[4] It is not clear when Japanese people began to gain a command of Classical Chinese themselves. According to legends recorded in the Nihon Shoki and Kojiki, a scholar called Wani (王仁) was dispatched by the Kingdom of Baekje in southwestern Korea to the Japanese islands during the reign of Emperor Ōjin, bringing with him knowledge of Confucianism and the Chinese writing system.

The first actual Japanese documents were probably written by Chinese immigrants. For example, the diplomatic correspondence from King Bu of Wa to Emperor Shun of the Liu Song Dynasty in 478 has been praised for its skillful use of allusion. Later, groups of people called fuhito were organized under the monarch to read and write Classical Chinese. From the 6th century onwards, Chinese documents written in Japan tended to show linguistic interference from Japanese, suggesting the wide acceptance of Chinese characters in Japan.

The Japanese language itself had no written form at the time kanji were introduced. Originally, texts were written in the Chinese language and would have been read as such. Over time, however, a system known as kanbun (漢文) emerged, which involved using Chinese text with diacritical marks to allow Japanese speakers to restructure and read Chinese sentences, by changing word order and adding particles and verb endings, in accordance with the rules of Japanese grammar.

Chinese characters also came to be used to write Japanese words, resulting in the modern kana syllabaries. A writing system called man'yōgana (used in the ancient poetry anthology Man'yōshū) evolved that used a number of Chinese characters for their sound, rather than for their meaning. Man'yōgana written in cursive style evolved into hiragana, a writing system that was accessible to women (who were denied higher education). Major works of Heian era literature by women were written in hiragana. Katakana emerged via a parallel path: monastery students simplified man'yōgana to a single constituent element. Thus the two other writing systems, hiragana and katakana, referred to collectively as kana, are actually descended from kanji.

In modern Japanese, kanji are used to write parts of the language such as nouns, adjective stems, and verb stems, while hiragana are used to write inflected verb and adjective endings (okurigana), particles, and miscellaneous words which have no kanji or whose kanji is considered obscure or too difficult to read or remember. Katakana are used for representing onomatopoeia, non-Japanese loanwords (except those borrowed from Chinese), the names of plants and animals (with exceptions), and for emphasis on certain words.

Local developments and divergences from Chinese

While kanji are essentially Chinese hanzi used to write Japanese, there are now significant differences between kanji used in Japanese and Chinese characters used in Chinese. Such differences include (i) the use of characters created in Japan, (ii) characters that have been given different meanings in Japanese, and (iii) post-World War II simplifications of the kanji. Likewise, the process of character simplification in mainland China since the 1950s means that Japanese speakers who have not studied Chinese may not recognize some simplified characters.

Kokuji

Kokkun

In addition to kokuji, there are kanji that have been given meanings in Japanese different from their original Chinese meanings. These are not considered kokuji but are instead called kokkun (国訓) and include characters such as:

藤 fuji (wisteria; Ch. téng rattan, cane, vine)
沖 oki (offing, offshore; Ch. chōng rinse, minor river (Cantonese))
椿 tsubaki (Camellia japonica; Ch. chūn Ailanthus)

Readings

Reading Characters in Japanese
	Meaning	Pronunciation
a) semantic on	L1	L1
b) semantic kun	L1	L2
c) phonetic on	—	L1
d) phonetic kun	—	L2
*With L1 representing the language borrowed from (Chinese) and L2 representing the borrowing language (Japanese).^[7]

Because of the way they have been adopted into Japanese, a single kanji may be used to write one or more different words (or, in some cases, morphemes). From the point of view of the reader, kanji are said to have one or more different "readings". Deciding which reading is meant depends on context, intended meaning, use in compounds, and even location in the sentence. Some common kanji have ten or more possible readings. These readings are normally categorized as either on'yomi (literally, sound reading) or kun'yomi (literally, meaning reading).

On'yomi (Chinese reading)

The on'yomi (音読み), the Sino-Japanese reading, is the modern descendant of the Japanese approximation of the Chinese pronunciation of the character at the time it was introduced. Some kanji were introduced from different parts of China at different times, and so have multiple on'yomi, and often multiple meanings. Kanji invented in Japan would not normally be expected to have on'yomi, but there are exceptions, such as the character 働 "to work", which has the kun'yomi "hataraku" and the on'yomi "dō", and 腺 "gland", which has only the on'yomi "sen" – in both cases these come from the on'yomi of the phonetic component, respectively 動 "dō" and 泉 "sen".

Generally, on'yomi are classified into four types:

Go-on (呉音, "Wu sound") readings are from the pronunciation during the Southern and Northern Dynasties during the 5th and 6th centuries. There is a high probability of Go referring to the Wu region (in the vicinity of modern Shanghai), which still maintains linguistic similarities with modern Japanese.
Kan-on (漢音, "Han sound") readings are from the pronunciation during the Tang Dynasty in the 7th to 9th centuries, primarily from the standard speech of the capital, Chang'an (長安 or 长安, modern Xi'an). Here, Kan is used in the sense of China.
Tō-on (唐音, "Tang sound") readings are from the pronunciations of later dynasties, such as the Song (宋) and Ming (明). They cover all readings adopted from the Heian era (平安) to the Edo period (江戸). This is also known as Tōsō-on (唐宋音).
Kan'yō-on (慣用音, "Idiomatic sound") readings are mistaken or changed readings of the kanji that have become accepted into the language. In some cases, they are the actual readings that accompanied the character's introduction to Japan, but do not match how the character "should" be read according to the rules of character construction and pronunciation.

Examples (rare readings in parentheses)
Kanji	Meaning	Go-on	Kan-on	Tō-on	Kan'yō-on
明	bright	myō	mei	(min)	—
行	go	gyō gō	kō kō	(an)	—
極	extreme	goku	kyoku	—	—
珠	pearl	shu	shu	ju	(zu)
度	degree	do	(to)	—	—
輸	transport	(shu)	(shu)	—	yu
雄	masculine	—	—	—	yū
熊	bear	—	—	—	yū
子	child	shi	shi	su	—
清	clear	shō	sei	(shin)	—
京	capital	kyō	kei	(kin)	—
兵	soldier	hyō	hei	—	—
強	strong	gō	kyō	—	—

Kan-on readings are the most common. The go-on readings are especially common in Buddhist terminology such as gokuraku 極楽 "paradise", as well as in some of the earliest loans, such as the Sino-Japanese numbers. The tō-on readings occur in some later words, such as isu 椅子 "chair", futon 布団 "mattress", and andon 行灯, "a kind of paper lantern".

In Chinese, most characters are associated with a single Chinese sound. However, some homographs called 多音字 (pinyin: duōyīnzì) such as 行 (pinyin: háng or xíng) (Japanese: gō, gyō) have more than one reading in Chinese representing different meanings, which is reflected in the carryover to Japanese as well. Additionally, many Chinese syllables, especially those with an entering tone, did not fit the largely consonant-vowel (CV) phonotactics of classical Japanese. Thus most on'yomi are composed of two morae (beats), the second of which is either a lengthening of the vowel in the first mora, the vowel i, or one of the syllables ku, ki, tsu, chi, or moraic n, chosen for their approximation to the final consonants of Middle Chinese. It may be that palatalized consonants before vowels other than i developed in Japanese as a result of Chinese borrowings, as they are virtually unknown in words of native Japanese origin.

On'yomi primarily occur in multi-kanji compound words (熟語 jukugo), many of which are the result of the adoption, along with the kanji themselves, of Chinese words for concepts that either did not exist in Japanese or could not be articulated as elegantly using native words. This borrowing process is often compared to the English borrowings from Latin, Greek, and Norman French, since Chinese-borrowed terms are often more specialized, or considered to sound more erudite or formal, than their native counterparts. The major exception to this rule is family names, in which the native kun'yomi reading is usually used (though on'yomi are found in many personal names, especially men's names).

Kun'yomi (Japanese reading)

The kun'yomi (訓読み), Japanese reading, or native reading (literally, meaning reading), is a reading based on the pronunciation of a native Japanese word, or yamato kotoba, that closely approximated the meaning of the Chinese character when it was introduced. As with on'yomi, there can be multiple kun readings for the same kanji, and some kanji have no kun'yomi at all.

For instance, the kanji for east, 東, has the on reading tō. However, Japanese already had two words for "east": higashi and azuma. Thus the kanji 東 had the latter readings added as kun'yomi. In contrast, the kanji 寸, denoting a Chinese unit of measurement (about 30 mm or 1.2 inches), has no native Japanese equivalent; it has only an on'yomi, sun, with no native kun reading. Most kokuji (characters created in Japan) have only kun readings, although some have back-formed a pseudo-on reading by analogy with similar characters, such as 働 dō, from 動 dō, and a few, such as 腺 sen "gland", have only an on'yomi.
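The relationship between a character and its readings can be pictured as a small record with two lists. The following is only an illustrative sketch: the class name `KanjiReadings`, the table `READINGS`, and the helper `kanji_without_kun` are hypothetical, and the entries are limited to the examples mentioned in this article rather than drawn from a real dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class KanjiReadings:
    """On'yomi and kun'yomi of one kanji (romanized, illustrative only)."""
    kanji: str
    on: Tuple[str, ...] = ()
    kun: Tuple[str, ...] = ()

    def has_kun(self) -> bool:
        return bool(self.kun)


# Hypothetical mini-table built from the examples in the text.
READINGS: Dict[str, KanjiReadings] = {
    "東": KanjiReadings("東", on=("tō",), kun=("higashi", "azuma")),
    "寸": KanjiReadings("寸", on=("sun",)),
    "働": KanjiReadings("働", on=("dō",), kun=("hataraku",)),
    "腺": KanjiReadings("腺", on=("sen",)),
}


def kanji_without_kun(table: Dict[str, KanjiReadings]) -> List[str]:
    """List the characters in the table that have no kun'yomi."""
    return [k for k, r in table.items() if not r.has_kun()]


if __name__ == "__main__":
    print(kanji_without_kun(READINGS))
```

A real dictionary would also need to record okurigana boundaries and rare readings, which this sketch leaves out.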

Kun'yomi are characterized by the strict (C)V syllable structure of yamato kotoba. Most noun or adjective kun'yomi are two to three syllables long, while verb kun'yomi are usually between one and three syllables in length, not counting trailing hiragana called okurigana. Okurigana are not considered to be part of the internal reading of the character, although they are part of the reading of the word. A beginner in the language will rarely come across characters with long readings, but readings of three or even four syllables are not uncommon.

承る uketamawaru and 志 kokorozashi have five syllables represented by a single kanji, the longest readings of any kanji in the Jōyō character set. These unusually long readings have different causes. 承る is a single character for a compound verb, one component of which has a long reading (alternative spelling as 受け賜る u(ke)-tamawa-ru, hence (1+1)+3=5; compare common 受け付ける u(ke)-tsu(ke)ru). 志 is a nominalization of the verb 志す, which has a long reading (kokoroza-su); the nominalization removes the okurigana and so lengthens the reading by one mora, yielding 4+1=5 (compare common 話 hanashi 2+1=3, from 話す hana-su). Longer readings exist for non-Jōyō characters and non-kanji symbols, where a long gairaigo word may be the reading (this is classed as kun'yomi; see other readings, below). The character 糎 has the seven-kana reading センチメートル senchimētoru "centimeter", though it is generally written as "cm" (with two half-width characters, so occupying one space); another common example is '%' (the percent sign), which has the five-kana reading パーセント pāsento.
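The length counts above can be checked mechanically once a reading is written in kana. The sketch below is an assumption-laden simplification: the function name `count_morae` is hypothetical, the input must be pure hiragana or katakana, and the only rule applied is that a small kana such as ゃ or ョ merges with the preceding kana, while the small tsu and the long-vowel mark each count as one mora.

```python
import unicodedata

# Small kana that fuse with the preceding kana into one mora (e.g. きょ).
# The small tsu (っ/ッ) and the long-vowel mark (ー) are deliberately not
# listed: each of them counts as a mora of its own.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def count_morae(kana: str) -> int:
    """Count morae in a reading written purely in hiragana or katakana."""
    # NFC keeps voiced kana such as ざ or パ as single code points.
    normalized = unicodedata.normalize("NFC", kana)
    return sum(1 for ch in normalized if ch not in SMALL_KANA)


if __name__ == "__main__":
    for reading in ("こころざし", "うけたまわ", "はなし", "センチメートル", "パーセント"):
        print(reading, count_morae(reading))
```

For readings without small kana, as in all the examples in this paragraph, the mora count is simply the number of kana.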

In a number of cases, multiple kanji were assigned to cover a single Japanese word. Typically when this occurs, the different kanji refer to specific shades of meaning. For instance, the word なおす, naosu, when written 治す, means "to heal an illness or sickness". When written 直す it means "to fix or correct something". Sometimes the distinction is very clear, although not always. Differences of opinion among reference works are not uncommon; one dictionary may say the kanji are equivalent, while another dictionary may draw distinctions of use. As a result, native speakers of the language may have trouble knowing which kanji to use and resort to personal preference or write the word in hiragana. This latter strategy is frequently employed with more complex cases such as もと moto, which has at least five different kanji: 元, 基, 本, 下, and 素, the first three of which have only very subtle differences.

Local dialectal readings of kanji are also classified under kun'yomi, most notably readings for words in Ryukyuan languages. Further, in rare cases gairaigo (borrowed words) have a single character associated with them, in which case this reading is formally classified as a kun'yomi, because the character is being used for meaning, not sound. This is discussed under other readings, below.

When to use which reading

Although there are general rules for when to use on'yomi and when to use kun'yomi, the language is littered with exceptions, and it is not always possible for even a native speaker to know how to read a character without prior knowledge (this is especially true for names, both of people and places); further, a given character may have multiple kun'yomi or on'yomi. When reading Japanese, one primarily recognizes words (multiple characters and okurigana) and their readings, rather than individual characters, and only guesses the readings of characters when trying to "sound out" an unrecognized word. Homographs exist, however, which can sometimes be deduced from context, and sometimes cannot, requiring a gloss. For example, 今日 may be read either as kyō "today (informal)" (special fused reading for native word) or as konnichi "these days (formal)" (on'yomi); in formal writing this will generally be read as konnichi.

The main guideline is that a single kanji followed by okurigana (hiragana characters that are part of the word), as used in native verbs and adjectives, always indicates kun'yomi, while kanji compounds (kango) usually use on'yomi, which is usually kan-on; however, other on readings are also common, and kun readings are also commonly used in kango. A kanji in isolation without okurigana is typically read using its kun'yomi, though there are numerous exceptions.

Okurigana are used with kun'yomi to mark the inflected ending of a native verb or adjective, or by convention. Note that Japanese verbs and adjectives are closed classes and do not generally admit new words: borrowed Chinese vocabulary, which consists of nouns, can form verbs by adding -suru (〜する, "to do") at the end, and adjectives via 〜の -no or 〜な -na, but cannot become native Japanese vocabulary, which inflects. For example: 赤い aka-i "red", 新しい atara-shii "new", 見る mi-ru "(to) see". Okurigana can be used to indicate which kun reading to use, as in 食べる ta-beru versus 食う ku-u (casual), both meaning "(to) eat", but this is not always sufficient, as in 開く, which may be read as a-ku or hira-ku, both meaning "(to) open". 生 is a particularly complicated example, with multiple kun and on readings; see okurigana: 生 for details. Okurigana are also used for some nouns and adverbs, as in 情け nasake "sympathy", 必ず kanarazu "invariably", but not for 金 kane "money", for instance. Okurigana are an important aspect of kanji usage in Japanese; see that article for more information on kun'yomi orthography.

Kanji occurring in compounds (called 熟語 jukugo in Japanese) are generally read using on'yomi, though again, exceptions abound. For example, 情報 jōhō "information", 学校 gakkō "school", and 新幹線 shinkansen "bullet train" all follow this pattern. This isolated kanji versus compound distinction gives words for similar concepts completely different pronunciations. 東 "east" and 北 "north" use the kun readings higashi and kita, being stand-alone characters, while 北東 "northeast", as a compound, uses the on reading hokutō. This is further complicated by the fact that many kanji have more than one on'yomi: 生 is read as sei in 先生 sensei "teacher" but as shō in 一生 isshō "one's whole life". Meaning can also be an important indicator of reading; 易 is read i when it means "simple", but as eki when it means "divination", both being on'yomi for this character.
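The main guideline can be written down as a first-guess heuristic. The sketch below is not a reading algorithm: the function names are hypothetical, only the basic hiragana and CJK Unified Ideographs blocks are recognized, and the result is merely the default expectation (kun for a lone kanji with or without okurigana, on for an all-kanji compound), which real words frequently violate.

```python
def is_hiragana(ch: str) -> bool:
    return "\u3041" <= ch <= "\u309f"


def is_kanji(ch: str) -> bool:
    # Basic CJK Unified Ideographs block only; extension blocks are ignored.
    return "\u4e00" <= ch <= "\u9fff"


def guess_reading_type(word: str) -> str:
    """Return 'kun', 'on', or 'unknown' as a first guess for a written word."""
    kanji_count = sum(1 for ch in word if is_kanji(ch))
    if kanji_count == 0:
        return "unknown"
    # One leading kanji, optionally followed by okurigana: expect kun'yomi.
    if kanji_count == 1 and is_kanji(word[0]):
        if all(is_hiragana(ch) for ch in word[1:]):
            return "kun"
    # Two or more kanji and nothing else: expect on'yomi.
    if kanji_count >= 2 and kanji_count == len(word):
        return "on"
    return "unknown"


if __name__ == "__main__":
    for w in ("赤い", "見る", "東", "情報", "北東", "手紙", "折り紙"):
        print(w, guess_reading_type(w))
```

Note that the heuristic guesses "on" for 手紙, which is in fact read with kun'yomi (tegami); names and mixed compounds fall outside its scope entirely.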

These rules of thumb have many exceptions. Kun'yomi compound words are not as numerous as those with on'yomi, but neither are they rare. Examples include 手紙 tegami "letter", 日傘 higasa "parasol", and the famous 神風 kamikaze "divine wind". Such compounds may also have okurigana, such as 空揚げ (also written 唐揚げ) karaage "Chinese-style fried chicken" and 折り紙 origami, although many of these can also be written with the okurigana omitted (for example, 空揚 or 折紙).

Similarly, some on'yomi characters can also be used as words in isolation: 愛 ai "love", 禅 Zen, 点 ten "mark, dot". Most of these cases involve kanji that have no kun'yomi, so there can be no confusion, although exceptions do occur. A lone 金 may be read as kin "gold" or as kane "money, metal"; only context can determine the writer's intended reading and meaning.

Multiple readings have given rise to a number of homographs, in some cases having different meanings depending on how they are read. One example is 上手, which can be read in three different ways: jōzu (skilled), uwate (upper part), or kamite (stage left/house right). In addition, 上手い has the reading umai (skilled). More subtly, 明日 has three different readings, all meaning "tomorrow": ashita (casual), asu (polite), and myōnichi (formal). Furigana (reading glosses) is often used to clarify any potential ambiguities.

Conversely, in some cases homophonous terms may be distinguished in writing by different characters, but not so distinguished in speech, and hence potentially confusing. In some cases when it is important to distinguish these in speech, the reading of a relevant character may be changed. For example, 私立 (privately established, esp. school) and 市立 (city established) are both normally pronounced shi-ritsu; in speech these may be distinguished by the alternative pronunciations watakushi-ritsu and ichi-ritsu. More informally, in legal jargon 前文 "preamble" and 全文 "full text" are both pronounced zen-bun, so 前文 may be pronounced mae-bun for clarity, as in "Have you memorized the preamble [not 'whole text'] of the constitution?". As in these examples, this is primarily using a kun reading for one character in a normally on'yomi term.

Mixed readings, known as 重箱 jūbako (on-kun) and 湯桶 yutō (kun-on) readings, are also not uncommon. Indeed, all four combinations of reading are possible: on-on, kun-kun, kun-on and on-kun.

Some famous place names, including those of Tokyo (東京 Tōkyō) and Japan itself (日本 Nihon or sometimes Nippon), are read with on'yomi; however, the majority of Japanese place names are read with kun'yomi: 大阪 Ōsaka, 青森 Aomori, 箱根 Hakone. When characters are used as abbreviations of place names, their reading may not match that in the original. The Osaka (大阪) and Kobe (神戸) baseball team, the Hanshin (阪神) Tigers, takes its name from the on'yomi of the second kanji of Ōsaka and the first of Kōbe. The name of the Keisei (京成) railway line, linking Tokyo (東京) and Narita (成田), is formed similarly, although the reading of 京 from 東京 is kei, despite kyō already being an on'yomi in the word Tōkyō.

Japanese family names are also usually read with kun'yomi: 山田 Yamada, 田中 Tanaka, 鈴木 Suzuki. Japanese given names often have very irregular readings. Although they are not typically considered jūbako or yutō, they often contain mixtures of kun'yomi, on'yomi and nanori, such as 大助 Daisuke [on-kun], 夏美 Natsumi [kun-on]. Being chosen at the discretion of the parents, the readings of given names do not follow any set rules, and it is impossible to know with certainty how to read a person's name without independent verification. Parents can be quite creative, and rumours abound of children called 地球 Āsu and 天使 Enjeru, quite literally "Earth" and "Angel"; neither is a common name, and the words are normally read chikyū and tenshi respectively. Common patterns do exist, however, allowing experienced readers to make a good guess for most names.

Chinese place names and Chinese personal names appearing in Japanese texts, if spelled in kanji, are almost invariably read with on'yomi. Especially for older and well-known names, the resulting Japanese pronunciation may differ widely from that used by Chinese speakers. For example, Mao Zedong's name, written 毛沢東, is pronounced as Mō Takutō in Japanese. Today, Chinese names that are not well known in Japan are often spelled in katakana instead, in a form much more closely approximating the native Chinese pronunciation. Alternatively, they may be written in kanji with katakana furigana.

Pronunciation assistance

Because of the ambiguities involved, kanji sometimes have their pronunciation for the given context spelled out in ruby characters known as furigana (small kana written above or to the right of the character) or kumimoji (small kana written in-line after the character). This is especially true in texts for children or foreign learners and in manga (comics). It is also used in newspapers for rare or unusual readings and for characters not included in the officially recognized set of essential kanji.

Spelling words

Conversely, specifying a given kanji, or spelling out a kanji word, whether the pronunciation is known or not, can be complicated, because there is no commonly used standard way to refer to individual kanji (one does not refer to *"kanji #237"), and because a given reading does not map to a single kanji; indeed there are many homophonous words, not simply individual characters, particularly for kango (with on'yomi readings). The easiest approach is to write the word out, either on paper or by tracing it in the air, or to look it up (given the pronunciation) in a dictionary, particularly an electronic dictionary. When this is not possible, such as when speaking over the phone or when writing implements are not available (and tracing in the air is too complicated), various techniques can be used. These include giving kun readings for characters (these are often unique), using a well-known word with the same character (and preferably the same pronunciation and meaning), and describing the character via its components. For example, one may explain how to spell the word kōshinryō (香辛料, spice) via the words kao-ri (香り, fragrance), kara-i (辛い, spicy), and in-ryō (飲料, beverage), where the first two use the kun readings and the third is a well-known compound, by saying "kaori, karai, ryō as in inryō".

Total number of kanji

The number of possible characters is disputed. The Daikanwa Jiten contains about 50,000 characters, and this was thought to be comprehensive, but more recent mainland Chinese dictionaries, such as the Yiti Zidian dictionary published in 2004, contain 100,000 or more characters, many consisting of obscure variants. The vast majority of these are not in common use in either Japan or China; as discussed below, approximately 2,000 to 3,000 characters are in common use in Japan.

Orthographic reform and lists of kanji

In 1946, following World War II, the Japanese government instituted a series of orthographic reforms. This was done with the goal of facilitating learning for children and simplifying kanji use in literature and periodicals. The number of characters in circulation was reduced, and formal lists of characters to be learned during each grade of school were established. Some characters were given simplified glyphs, called 新字体 (shinjitai). Many variant forms of characters and obscure alternatives for common characters were officially discouraged.

These are simply guidelines, so many characters outside these standards are still widely known and commonly used; these are known as hyōgaiji (表外字).

Kyōiku kanji

The Kyōiku kanji (教育漢字, "education kanji") are 1,006 characters that Japanese children learn in elementary school. The number was 881 until 1981. The grade-level breakdown of the education kanji is known as the gakunen-betsu kanji haitōhyō (学年別漢字配当表), or the gakushū kanji.

Jōyō kanji

The Jōyō kanji (常用漢字, "regular-use kanji") are 2,136 characters consisting of all the Kyōiku kanji, plus 1,130 additional kanji taught in junior high and high school. In publishing, characters outside this category are often given furigana. The Jōyō kanji were introduced in 1981, replacing an older list of 1,850 characters known as the Tōyō kanji (当用漢字, "general-use kanji") introduced in 1946. Originally numbering 1,945 characters, the Jōyō kanji list was extended to 2,136 in 2010. Some of the new characters were previously Jinmeiyō kanji; some are used to write prefecture names: 阪, 熊, 奈, 岡, 鹿, 梨, 阜, 埼, 茨, 栃 and 媛.

Jinmeiyō kanji

Since September 27, 2004, the Jinmeiyō kanji (人名用漢字, "kanji for use in personal names") consist of 2,928 characters: the Jōyō kanji (1,945 at that time) plus an additional 983 kanji found in people's names. There were only 92 kanji in the original list published in 1952, but new additions have been made frequently. Sometimes the term Jinmeiyō kanji refers to all 2,928, and sometimes it refers only to the 983 that are used only for names.

Hyōgaiji

Hyōgaiji (表外字, "unlisted characters") are any kanji not contained in the jōyō kanji and jinmeiyō kanji lists. These are generally written using traditional characters, but extended shinjitai forms exist.

Japanese Industrial Standards for kanji

The Japanese Industrial Standards for kanji and kana define character code-points for each kanji and kana, as well as other forms of writing such as the Latin alphabet, Cyrillic script, Greek alphabet, Hindu-Arabic numerals, etc. for use in information processing. They have had numerous revisions. The current standards are:

JIS X 0208:1997, the most recent version of the main standard. It has 6,355 kanji.
JIS X 0212:1990, a supplementary standard containing a further 5,801 kanji. It is rarely used, mainly because the common Shift JIS encoding system could not use it, and is effectively obsolete.
JIS X 0213:2000, a further revision which extended the JIS X 0208 set with 3,625 additional kanji, of which 2,741 were in JIS X 0212. The standard is in part designed to be compatible with Shift JIS encoding.
JIS X 0221:1995, the Japanese version of the ISO 10646/Unicode standard.
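To make the idea of a JIS code-point concrete, the sketch below recovers the JIS X 0208 row and cell (kuten) of a character from its EUC-JP bytes, relying on the fact that EUC-JP stores each of the two position numbers with 0xA0 added. The function name `jis_x_0208_kuten` is hypothetical, and the example assumes Python's built-in `euc_jp` and `shift_jis` codecs; it is an illustration rather than a conformance tool.

```python
from typing import Optional, Tuple


def jis_x_0208_kuten(ch: str) -> Optional[Tuple[int, int]]:
    """Return the (row, cell) position of `ch` in JIS X 0208, or None."""
    try:
        raw = ch.encode("euc_jp")
    except UnicodeEncodeError:
        return None  # not representable in EUC-JP at all
    # A JIS X 0208 character is exactly two bytes, each in 0xA1..0xFE.
    # ASCII (1 byte), half-width katakana (0x8E prefix) and JIS X 0212
    # characters (3 bytes, 0x8F prefix) are filtered out here.
    if len(raw) != 2 or not all(0xA1 <= b <= 0xFE for b in raw):
        return None
    return raw[0] - 0xA0, raw[1] - 0xA0


if __name__ == "__main__":
    for ch in "漢字":
        print(ch, jis_x_0208_kuten(ch), ch.encode("shift_jis").hex())
```

The same two characters occupy different byte values in Shift JIS, which is one reason text must carry or imply its encoding.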

Gaiji

Gaiji (外字), literally meaning "external characters", are kanji that are not represented in existing Japanese encoding systems. These include variant forms of common kanji that need to be represented alongside the more conventional glyph in reference works, and can include non-kanji symbols as well.

Gaiji can be either user-defined characters or system-specific characters. Both are a problem for information interchange, as the codepoint used to represent an external character will not be consistent from one computer or operating system to another.

Gaiji were nominally prohibited in JIS X 0208:1997, and JIS X 0213:2000 used the range of code-points previously allocated to gaiji, making them completely unusable. Nevertheless, they persist today in NTT DoCoMo's "i-mode" service, where they are used for emoji (pictorial characters).

Unicode allows for optional encoding of gaiji in private use areas, while Adobe's SING (Smart INdependent Glyphlets)^[8]^[9] technology allows the creation of customized gaiji.
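As a rough illustration of why private-use gaiji hinder interchange, the sketch below flags characters that fall in Unicode's three private use ranges, whose meaning is defined only by private agreement. The function names `is_private_use` and `find_gaiji_candidates` are hypothetical; the ranges themselves are those defined by the Unicode Standard.

```python
from typing import List, Tuple

PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
)


def is_private_use(ch: str) -> bool:
    """True if the single character `ch` lies in a private use range."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in PRIVATE_USE_RANGES)


def find_gaiji_candidates(text: str) -> List[Tuple[int, str]]:
    """Return (index, 'U+XXXX') pairs for private-use characters in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]


if __name__ == "__main__":
    sample = "漢\ue000字"  # U+E000 stands in for a user-defined glyph
    print(find_gaiji_candidates(sample))
```

A check of this kind can only report that a private-use code point is present; what glyph it stands for depends on the font or system that defined it.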

The Text Encoding Initiative uses a <g> element to encode any non-standard character or glyph, including gaiji.^[10] (The g stands for "gaiji".^[11])

Types of Kanji: by category

The Chinese scholar Xu Shen (許慎), in the Shuōwén Jiězì (說文解字) of c. 100 CE, classified Chinese characters into six categories (Japanese: 六書 rikusho). The traditional classification is still taught but is problematic and no longer the focus of modern lexicographic practice, as some categories are not clearly defined, nor are they mutually exclusive: the first four refer to structural composition, while the last two refer to usage.

Shōkei-moji (象形文字)

These characters are pictograms, sketches of the object they represent. For example, 目 is an eye, 木 is a tree, etc. (Shōkei 象形 is also the Japanese word for Egyptian hieroglyphs). The current forms of the characters are very different from the original, and it is now hard to see the origin in many of these characters. It is somewhat easier to see in seal script. These make up a small fraction of modern characters.

Shiji-moji (指事文字)

Shiji-moji are ideograms, often called "simple ideograms" or "simple indicatives" to distinguish them from compound ideograms (below). They are usually simple graphically and represent an abstract concept such as 上 "up" or "above" and 下 "down" or "below". These make up a tiny fraction of modern characters.

Kaii-moji (会意文字)

These are compound ideograms, often called "compound indicatives", "associative compounds", or just "ideograms". These are usually a combination of pictograms that combine iconically to present an overall meaning. An example is the kokuji 峠 (mountain pass) made from 山 (mountain), 上 (up) and 下 (down). Another is 休 (rest) from 人 (person) and 木 (tree). These make up a tiny fraction of modern characters.

Keisei-moji (形声文字)

These phono-semantic or radical-phonetic compounds, sometimes called "semantic-phonetic", "semasio-phonetic", or "phonetic-ideographic" characters, are by far the largest category, making up about 90% of the characters in the standard lists; however, some of the most frequently used kanji belong to one of the three groups mentioned above, so Keisei-moji will usually make up less than 90% of the characters in a text. Typically they are made up of two components, one of which (most commonly, but by no means always, the left or top element) suggests the general category of the meaning or semantic context, and the other (most commonly the right or bottom element) approximates the pronunciation. (The pronunciation really relates to the original Chinese, and may now only be distantly detectable in the modern Japanese on'yomi of the kanji; it generally has no relation at all to kun'yomi. The same is true of the semantic context, which may have changed over the centuries or in the transition from Chinese to Japanese. As a result, it is a common error in folk etymology to fail to recognize a phono-semantic compound, typically instead inventing a compound-indicative explanation.)

As examples, consider the kanji containing the 言 shape: 語, 記, 訳, 説, etc. All are related to word, language, or meaning. Similarly, kanji containing the 雨 (rain) shape (雲, 電, 雷, 雪, 霜, etc.) are almost invariably related to weather. Kanji with the 寺 (temple) shape on the right (詩, 持, 時, 侍, etc.) usually have an on'yomi of "shi" or "ji". Sometimes one can guess the meaning or reading, or both, simply from the components. However, exceptions do exist: for example, neither 需 nor 霊 has anything to do with weather (at least in modern usage), and 待 has an on'yomi of "tai". That is, a component may play a semantic role in one compound but a phonetic role in another.

Tenchū-moji (転注文字)

This group has variously been called "derivative characters" or "derivative cognates", or translated as "mutually explanatory" or "mutually synonymous" characters. It is the most problematic of the six categories, as it is vaguely defined. It may refer to kanji whose meaning or application has become extended. For example, 楽 is used for "music" and for "comfort, ease", with different pronunciations in Chinese reflected in the two different on'yomi, gaku "music" and raku "pleasure".

Kasha-moji (仮借文字)

These are rebuses, sometimes called "phonetic loans". The etymology of the characters follows one of the patterns above, but the present-day meaning is completely unrelated to it: a character was appropriated to represent a similar-sounding word. For example, 来 in ancient Chinese was originally a pictograph for "wheat". Its syllable was homophonous with the verb meaning "to come", and the character came to be used for that verb, without any embellishing "meaning" element attached. The character for wheat, 麦, originally meant "to come", being a keisei-moji with "foot" at the bottom for its meaning part and "wheat" at the top for its sound. The two characters swapped meanings, so today the more common word has the simpler character. This borrowing of sounds has a very long history.

Related symbols

The iteration mark (々) is used to indicate that the preceding kanji is to be repeated, functioning similarly to a ditto mark in English. It is pronounced as though the kanji were written twice in a row, for example 色々 (iroiro "various") and 時々 (tokidoki "sometimes"). This mark also appears in personal and place names, as in the surname Sasaki (佐々木). This symbol is a simplified version of the kanji 仝 (variant of 同 dō "same").
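The repetition rule for the iteration mark can be sketched in a few lines of Python. This is only an illustration of the written form: the function name `expand_iteration_marks` is hypothetical, and the sketch does not model readings, so sound changes such as tokidoki for 時々 are outside its scope.

```python
ITERATION_MARK = "\u3005"  # 々

def expand_iteration_marks(text: str) -> str:
    """Replace each 々 with a copy of the character written just before it."""
    out = []
    for ch in text:
        if ch == ITERATION_MARK and out:
            # Repeat the previous character in the output.
            out.append(out[-1])
        else:
            # Ordinary character (or a mark with nothing before it): keep as-is.
            out.append(ch)
    return "".join(out)

print(expand_iteration_marks("色々"))    # 色色
print(expand_iteration_marks("佐々木"))  # 佐佐木
```

A mark at the very start of a string has nothing to repeat, so the sketch leaves it unchanged.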

Another frequently used symbol is ヶ (a small katakana "ke"), pronounced "ka" when used to indicate quantity (such as 六ヶ月, rokkagetsu "six months") or "ga" in place names like Kasumigaseki (霞ヶ関). This symbol is a simplified version of the kanji 箇.

Radical-and-stroke sorting

Main article: Radical-and-stroke sorting

Kanji number in the thousands and cannot be ordered by a simple convention such as the one used for the Latin alphabet, so lists of kanji are commonly ordered by radical-and-stroke sorting. In this system, common components of characters are identified; these are called radicals in Chinese and in logographic systems derived from Chinese, such as kanji.

Characters are then grouped by their primary radical, then ordered by number of pen strokes within radicals. When there is no obvious radical or more than one radical, convention governs which is used for collation. For example, the Chinese character for "mother" (媽) is sorted as a thirteen-stroke character under the three-stroke primary radical (女) meaning "woman".
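As a minimal sketch of this ordering, the Python below sorts a handful of characters by a two-part key: the Kangxi radical number, then the number of strokes beyond the radical. The table `RADICAL_INDEX` and the function names are hypothetical and hand-filled for illustration; a real dictionary would take this data from a full radical index. Within one radical, ordering by residual strokes gives the same result as ordering by total strokes, since the radical's own stroke count is constant.

```python
# Hand-filled illustrative data: character -> (Kangxi radical number,
# strokes in addition to the radical). Radicals used: 9 人, 38 女, 75 木,
# 149 言, 173 雨.
RADICAL_INDEX = {
    "媽": (38, 10),  # 女 (3 strokes) + 10 = 13 strokes in total
    "好": (38, 3),
    "休": (9, 4),
    "語": (149, 7),
    "雪": (173, 3),
    "林": (75, 4),
    "木": (75, 0),   # the radical itself
}

def radical_stroke_key(ch: str) -> tuple[int, int]:
    """Sort key: primary radical first, then residual stroke count."""
    return RADICAL_INDEX[ch]

def sort_by_radical_and_stroke(chars: str) -> list[str]:
    """Order characters by radical, then by strokes within the radical."""
    return sorted(chars, key=radical_stroke_key)

print("".join(sort_by_radical_and_stroke("語媽雪林好休木")))  # 休好媽木林語雪
```

Characters missing from the table raise `KeyError` in this sketch; which radical a character is filed under is a matter of convention and has to come from the lookup data, not from the code.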

Kanji education

Japanese schoolchildren are expected to learn 1,006 basic kanji characters, the kyōiku kanji, before finishing the sixth grade. The order in which these characters are learned is fixed. The kyōiku kanji list is a subset of a larger list, originally of 1,945 kanji characters and extended in 2010 to 2,136, known as the jōyō kanji: the characters required for the level of fluency necessary to read newspapers and literature in Japanese. This larger list of characters is to be mastered by the end of the ninth grade.^[12] Schoolchildren learn the characters by repetition and by radical.

Students studying Japanese as a foreign language are often required by a curriculum to acquire kanji without having first learned the vocabulary associated with them. Strategies for these learners vary from copying-based methods to mnemonic-based methods such as those used in James Heisig's series Remembering the Kanji. Other textbooks use methods based on the etymology of the characters, such as Mathias and Habein's The Complete Guide to Everyday Kanji and Henshall's A Guide to Remembering Japanese Characters. Pictorial mnemonics, as in the text Kanji Pict-o-graphix, are also seen.

The Japanese government provides the Kanji kentei (日本漢字能力検定試験 Nihon kanji nōryoku kentei shiken; "Test of Japanese Kanji Aptitude") which tests the ability to read and write kanji. The highest level of the Kanji kentei tests about 6,000 kanji.

Notes

^ Taylor, Insup; Taylor, Maurice Martin (1995). Writing and literacy in Chinese, Korean, and Japanese. Amsterdam: John Benjamins Publishing Company. p. 305. ISBN 90-272-1794-7. http://books.google.com/books?id=WDw4gBaPjZgC.
^ Suski, P.M. (2011). The Phonetics of Japanese Language: With Reference to Japanese Script. p. 1. http://books.google.com/books?id=lyUc7oNgaqoC.
^ Malatesha Joshi, R.; Aaron, P.G. (2006). Handbook of orthography and literacy. New Jersey: Routledge. pp. 481–2. ISBN 0-8058-4652-2. http://books.google.com/books?id=nkXzdWSyBFgC&pg.
^ "Gold Seal (Kin-in)". Fukuoka City Museum. http://museum.city.fukuoka.jp/english/eb/eb_fr2.html. Retrieved August 3, 2011.
^ "Kokuji list", SLJ FAQ, http://www.sljfaq.org/afaq/kokuji-list.html .
^ James H Buck, Some Observations on kokuji, in The Journal-Newsletter of the Association of Teachers of Japanese, Vol. 6, No. 2 (Oct. 15, 1969), pp. 45–9.
^ Rogers, Henry. Writing Systems: A Linguistic Approach. Oxford: Blackwell, 2005. Print.
^ Introducing the SING Gaiji architecture, Adobe, http://www.adobe.com/support/downloads/detail.jsp?ftpID=2437 .
^ OpenType Technology Center, Adobe, http://www.adobe.com/devnet/opentype/ .
^ "Representation of Non-standard Characters and Glyphs", P5: Guidelines for Electronic Text Encoding and Interchange, TEI-C, http://www.tei-c.org/release/doc/tei-p5-doc/en/html/WD.html .
^ "TEI element g (character or glyph)", P5: Guidelines for Electronic Text Encoding and Interchange, TEI-C, http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-g.html .
^ J. Halpern, The Kodansha Kanji Learner's Dictionary, p. 38a (2006).

References

DeFrancis, John (1990). The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press. ISBN 0-8248-1068-6.
Hadamitzky, W.; Spahn, M. (1981). Kanji and Kana. Boston: Tuttle.
Hannas, William. C. (1997). Asia's Orthographic Dilemma. Honolulu: University of Hawaii Press. ISBN 0-8248-1892-X (paperback); ISBN 0-8248-1842-3 (hardcover).
Kaiser, Stephen (1991). Introduction to the Japanese Writing System. In Kodansha's Compact Kanji Guide. Tokyo: Kodansha International. ISBN 4-7700-1553-4.
Morohashi Tetsuji, 大漢和辞典/Daikanwajiten (Comprehensive Chinese–Japanese Dictionary) 1984–1986. Tokyo: Taishukan (generally regarded as the most authoritative kanji dictionary)
Mitamura, Joyce Yumi and Mitamura, Yasuko Kosaka (1997). Let's Learn Kanji. Tokyo: Kodansha International. ISBN 4-7700-2068-6.
Unger, J. Marshall (1996). Literacy and Script Reform in Occupation Japan: Reading Between the Lines. ISBN 0-19-510166-9

External links

Jim Breen's WWWJDIC server used to find Kanji from English or romanized Japanese
Kanji Explorer More than 13000 Kanji
KanjiQ – Kanji flashcard tool that runs on mobile phones.
JISHOP – Japanese-English computer kanji dictionary
KanjiLearn – Electronic set of 2135 two-sided kanji flashcards, as easy to use as paper flashcards.
Convert Kanji to Romaji, Hiragana—Converts Kanji and websites to forms that are easy to read and gives a word by word translation
Tangorin—Find kanji fast by selecting their elements
Dictionary of Kokuji in Japanese
Learn Japanese Kanji—How to write Kanji in Japanese
Drill the kanji—online Java tool (Asahi-net)
Kanji Alive—Online kanji learning tool in wide use at many universities, colleges and high-schools.
Real Kanji—Practice kanji using different typefaces.
Change in Script Usage in Japanese: A Longitudinal Study of Japanese Government White Papers on Labor, discussion paper by Takako Tomoda in the Electronic Journal of Contemporary Japanese Studies, August 19, 2005.
Kanji Dictionary, a kanji dictionary with a focus on compound-exploring.
Genetic Kanji, Etymologically-organized lists for learning kanji.
Kanji Networks, a kanji etymology dictionary
(Japanese)漢字研究・漢字資料 ("Kanji studies, Kanji data")—official documents about Kanji.
Japanese Kanji Dictionary—Each character is presented by a grade, stroke count, stroke order, phonetic reading and native Japanese reading. You can also listen to the pronunciation.
WWWJDIC Text Translator—Takes Japanese text and returns each word with pronunciation (hiragana) and a translation in English.
JavaDiKt — Open source kanji dictionary for desktop

Chinese radicals according to the Kangxi Dictionary

1 stroke	1 一 2 丨 3 丶 4 丿 5 乙 6 亅

2 strokes	7 二 8 亠 9 人 10 儿 11 入 12 八 13 冂 14 冖 15 冫 16 几 17 凵 18 刀 19 力 20 勹 21 匕 22 匚 23 匸 24 十 25 卜 26 卩 27 厂 28 厶 29 又

3 strokes	30 口 31 囗 32 土 33 士 34 夂 35 夊 36 夕 37 大 38 女 39 子 40 宀 41 寸 42 小 43 尢 44 尸 45 屮 46 山 47 巛 48 工 49 己 50 巾 51 干 52 幺 53 广 54 廴 55 廾 56 弋 57 弓 58 彐 59 彡 60 彳

4 strokes	61 心 62 戈 63 戶 64 手 65 支 66 攴 67 文 68 斗 69 斤 70 方 71 无 72 日 73 曰 74 月 75 木 76 欠 77 止 78 歹 79 殳 80 毋 81 比 82 毛 83 氏 84 气 85 水 86 火 87 爪 88 父 89 爻 90 爿 91 片 92 牙 93 牛 94 犬

5 strokes	95 玄 96 玉 97 瓜 98 瓦 99 甘 100 生 101 用 102 田 103 疋 104 疒 105 癶 106 白 107 皮 108 皿 109 目 110 矛 111 矢 112 石 113 示 114 禸 115 禾 116 穴 117 立

6 strokes	118 竹 119 米 120 糸 121 缶 122 网 123 羊 124 羽 125 老 126 而 127 耒 128 耳 129 聿 130 肉 131 臣 132 自 133 至 134 臼 135 舌 136 舛 137 舟 138 艮 139 色 140 艸 141 虍 142 虫 143 血 144 行 145 衣 146 西

7 strokes	147 見 148 角 149 言 150 谷 151 豆 152 豕 153 豸 154 貝 155 赤 156 走 157 足 158 身 159 車 160 辛 161 辰 162 辵 163 邑 164 酉 165 釆 166 里

8 strokes	167 金 168 長 169 門 170 阜 171 隶 172 隹 173 雨 174 青 175 非

9 strokes	176 面 177 革 178 韋 179 韭 180 音 181 頁 182 風 183 飛 184 食 185 首 186 香

10 strokes	187 馬 188 骨 189 高 190 髟 191 鬥 192 鬯 193 鬲 194 鬼

11 strokes	195 魚 196 鳥 197 鹵 198 鹿 199 麥 200 麻

12 strokes	201 黃 202 黍 203 黑 204 黹

13 strokes	205 黽 206 鼎 207 鼓 208 鼠

14 strokes	209 鼻 210 齊

15 strokes	211 齒

16 strokes	212 龍 213 龜

17 strokes	214 龠

See also: List of Kangxi radicals
