Thai Industrial Standard 620-2533
Thai Industrial Standard 620-2533, commonly referred to as TIS-620, is the most common character set and character encoding for the Thai language. The standard is published by the Thai Industrial Standards Institute (TISI), an organ of the Ministry of Industry under the Royal Thai Government, and is the sole official standard for encoding Thai in Thailand. The descriptive name of the standard is "Standard for Thai Character Codes for Computers" (Thai: รหัสสำหรับอักขระไทยที่ใช้กับคอมพิวเตอร์). "2533" refers to year 2533 of the Buddhist Era (1990), the year the present version of the standard was published; a previous revision, TIS 620-2529 (1986), is now obsolete.
The IANA preferred charset name for the standard is TIS-620, and the same charset name is also used for ISO/IEC 8859-11, which adds a no-break space character at 0xA0 (unassigned in TIS-620). When the IANA name is used, the codes are supplemented with the C0 and C1 control codes from ISO/IEC 6429.
Structure
TIS-620 is a conventionally structured Extended ASCII national character set that retains full compatibility with 7-bit ASCII and uses the 8-bit range hex A1 to FB for encoding the Thai alphabet. Due to the complex combining nature of Thai vowels and diacritics, TIS-620 is intended for information interchange only, and an additional display engine is required to compose characters correctly.
Variants
A nearly identical version of TIS-620 was adopted as ISO/IEC 8859-11 in 2001; the sole difference is that ISO/IEC 8859-11 defines hex A0 as a non-breaking space, while TIS-620 leaves it undefined but reserved. (In practice, this small distinction is usually ignored.)
The ISO/IEC 8859-11 set has also been registered as ISO-IR-166 by Ecma International, but this variation adds explicit escape codes for signaling the beginning and end of Thai character sequences.
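The TIS-620 character ordering has also been carried over essentially unchanged into Unicode (ISO/IEC 10646). Unicode's Thai block is U+0E00 through U+0E7F, and a TIS-620 Thai byte can be converted to its Unicode code point by subtracting hex A0 from the byte value and adding the result to U+0E00. Equivalently, the UTF-16 code unit is 0E followed by the byte value minus A0; for example, A1 maps to U+0E01 and FB maps to U+0E5B.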
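The offset rule described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names `tis620_byte_to_codepoint` and `decode_tis620` are hypothetical, and passing bytes below 0x80 through unchanged and rejecting unassigned high bytes with `ValueError` are assumptions made for this example.

```python
def tis620_byte_to_codepoint(byte: int) -> int:
    """Map one TIS-620 byte to a Unicode code point (illustrative sketch)."""
    if byte < 0x80:
        # Assumption: treat the low half as plain ASCII, including the
        # control codes that TIS-620 itself leaves unassigned.
        return byte
    if 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
        # Thai range: subtract hex A0, then add the base of the Thai block.
        return 0x0E00 + (byte - 0xA0)
    # 80-9F, A0, DB-DE and FC-FF carry no character in TIS-620.
    raise ValueError(f"byte 0x{byte:02X} is not assigned in TIS-620")


def decode_tis620(data: bytes) -> str:
    """Decode a TIS-620 byte string one byte at a time."""
    return "".join(chr(tis620_byte_to_codepoint(b)) for b in data)
```

For instance, the bytes E4 B7 C2 decode to U+0E44 U+0E17 U+0E22, the word "ไทย". Python's standard library also ships a `tis_620` codec, which would normally be used in place of a hand-written mapping like this one.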
Character set
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0_ | | | | | | | | | | | | | | | | |
| 1_ | | | | | | | | | | | | | | | | |
| 2_ | SP 0020 32 | ! 0021 33 | " 0022 34 | # 0023 35 | $ 0024 36 | % 0025 37 | & 0026 38 | ' 0027 39 | ( 0028 40 | ) 0029 41 | * 002A 42 | + 002B 43 | , 002C 44 | - 002D 45 | . 002E 46 | / 002F 47 |
| 3_ | 0 0030 48 | 1 0031 49 | 2 0032 50 | 3 0033 51 | 4 0034 52 | 5 0035 53 | 6 0036 54 | 7 0037 55 | 8 0038 56 | 9 0039 57 | : 003A 58 | ; 003B 59 | < 003C 60 | = 003D 61 | > 003E 62 | ? 003F 63 |
| 4_ | @ 0040 64 | A 0041 65 | B 0042 66 | C 0043 67 | D 0044 68 | E 0045 69 | F 0046 70 | G 0047 71 | H 0048 72 | I 0049 73 | J 004A 74 | K 004B 75 | L 004C 76 | M 004D 77 | N 004E 78 | O 004F 79 |
| 5_ | P 0050 80 | Q 0051 81 | R 0052 82 | S 0053 83 | T 0054 84 | U 0055 85 | V 0056 86 | W 0057 87 | X 0058 88 | Y 0059 89 | Z 005A 90 | [ 005B 91 | \ 005C 92 | ] 005D 93 | ^ 005E 94 | _ 005F 95 |
| 6_ | ` 0060 96 | a 0061 97 | b 0062 98 | c 0063 99 | d 0064 100 | e 0065 101 | f 0066 102 | g 0067 103 | h 0068 104 | i 0069 105 | j 006A 106 | k 006B 107 | l 006C 108 | m 006D 109 | n 006E 110 | o 006F 111 |
| 7_ | p 0070 112 | q 0071 113 | r 0072 114 | s 0073 115 | t 0074 116 | u 0075 117 | v 0076 118 | w 0077 119 | x 0078 120 | y 0079 121 | z 007A 122 | { 007B 123 | \| 007C 124 | } 007D 125 | ~ 007E 126 | |
| 8_ | | | | | | | | | | | | | | | | |
| 9_ | | | | | | | | | | | | | | | | |
| A_ | | ก 0E01 161 | ข 0E02 162 | ฃ 0E03 163 | ค 0E04 164 | ฅ 0E05 165 | ฆ 0E06 166 | ง 0E07 167 | จ 0E08 168 | ฉ 0E09 169 | ช 0E0A 170 | ซ 0E0B 171 | ฌ 0E0C 172 | ญ 0E0D 173 | ฎ 0E0E 174 | ฏ 0E0F 175 |
| B_ | ฐ 0E10 176 | ฑ 0E11 177 | ฒ 0E12 178 | ณ 0E13 179 | ด 0E14 180 | ต 0E15 181 | ถ 0E16 182 | ท 0E17 183 | ธ 0E18 184 | น 0E19 185 | บ 0E1A 186 | ป 0E1B 187 | ผ 0E1C 188 | ฝ 0E1D 189 | พ 0E1E 190 | ฟ 0E1F 191 |
| C_ | ภ 0E20 192 | ม 0E21 193 | ย 0E22 194 | ร 0E23 195 | ฤ 0E24 196 | ล 0E25 197 | ฦ 0E26 198 | ว 0E27 199 | ศ 0E28 200 | ษ 0E29 201 | ส 0E2A 202 | ห 0E2B 203 | ฬ 0E2C 204 | อ 0E2D 205 | ฮ 0E2E 206 | ฯ 0E2F 207 |
| D_ | ะ 0E30 208 | ◌ั 0E31 209 | า 0E32 210 | ำ 0E33 211 | ◌ิ 0E34 212 | ◌ี 0E35 213 | ◌ึ 0E36 214 | ◌ื 0E37 215 | ◌ุ 0E38 216 | ◌ู 0E39 217 | ◌ฺ 0E3A 218 | | | | | ฿ 0E3F 223 |
| E_ | เ 0E40 224 | แ 0E41 225 | โ 0E42 226 | ใ 0E43 227 | ไ 0E44 228 | ๅ 0E45 229 | ๆ 0E46 230 | ◌็ 0E47 231 | ◌่ 0E48 232 | ◌้ 0E49 233 | ◌๊ 0E4A 234 | ◌๋ 0E4B 235 | ◌์ 0E4C 236 | ◌ํ 0E4D 237 | ◌๎ 0E4E 238 | ๏ 0E4F 239 |
| F_ | ๐ 0E50 240 | ๑ 0E51 241 | ๒ 0E52 242 | ๓ 0E53 243 | ๔ 0E54 244 | ๕ 0E55 245 | ๖ 0E56 246 | ๗ 0E57 247 | ๘ 0E58 248 | ๙ 0E59 249 | ๚ 0E5A 250 | ๛ 0E5B 251 | | | | |
In the table above, 20 is the regular SPACE character. Code values 00-1F, 7F, 80-9F, A0, DB-DE and FC-FF are not assigned to characters by TIS-620.
Code values D1, D4-DA and E7-EE are combining characters.
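The assigned and combining ranges listed above can be captured as lookup sets, which is one way a program might validate input or estimate how many display cells a string occupies. The sketch below is illustrative only: the names `ASSIGNED`, `COMBINING`, `is_assigned`, `is_combining` and `count_base_characters` are hypothetical, and counting non-combining bytes is a simplification that does not replace a real display engine.

```python
# Byte ranges are taken from the notes above; all names are illustrative.

# Graphic characters assigned by TIS-620: ASCII 20-7E plus Thai A1-DA and DF-FB.
ASSIGNED = frozenset([*range(0x20, 0x7F), *range(0xA1, 0xDB), *range(0xDF, 0xFC)])

# Combining characters: D1, D4-DA and E7-EE.
COMBINING = frozenset([0xD1, *range(0xD4, 0xDB), *range(0xE7, 0xEF)])


def is_assigned(byte: int) -> bool:
    """True if TIS-620 assigns a character to this byte value."""
    return byte in ASSIGNED


def is_combining(byte: int) -> bool:
    """True if the byte is a combining vowel, tone mark, or other diacritic."""
    return byte in COMBINING


def count_base_characters(data: bytes) -> int:
    """Count the bytes that are not combining characters.

    Simplifying assumption: each non-combining byte takes one display cell
    and each combining byte stacks on the preceding character.
    """
    return sum(1 for b in data if not is_combining(b))
```

As a usage example, the syllable "ที่" is encoded as B7 D5 E8 (a consonant followed by a vowel sign and a tone mark), so `count_base_characters(b"\xb7\xd5\xe8")` returns 1.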
Further reading
- Flohr, Guido (2016) [2006]. "Locale::RecodeData::TIS_620 - Conversion routines for TIS-620". CPAN libintl-perl. 1.0. Archived from the original on 2017-01-14. Retrieved 2017-01-14.
- https://www.math.nmsu.edu/~mleisher/Software/csets/TIS620.TXT
External links
- Official reference (in Thai)
- Announcement in Royal Gazette of TIS 620-2533 and TIS 620-2529
- Mapping of TIS-620 to ISO 10646 at the Wayback Machine (archived June 5, 2013) (not authoritative)