Talk:CESU-8
From Wikipedia, the free encyclopedia
rewrite
I have started a rewrite at CESU-8/temp. As per the instructions in the copyvio template, I am reporting it here. Plugwash 13:51, 14 July 2005 (UTC)
- Temp page has replaced the main article. RedWolf 03:34, July 22, 2005 (UTC)
examples
Can you please give an example of a string encoded in CESU-8, and say in which cases it is treated in a special way? -- Nichtich 00:23, 27 October 2005 (UTC)
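- A minimal sketch in Python of what such an example could look like. The helper name `cesu8_encode` is made up for illustration (Python has no built-in CESU-8 codec). It encodes each UTF-16 code unit as if it were a code point of its own, so only characters above U+FFFF come out differently from UTF-8.

```python
def cesu8_encode(text: str) -> bytes:
    """Hypothetical helper: encode text as CESU-8 via its UTF-16 code units."""
    # "surrogatepass" lets unpaired surrogates through instead of raising.
    utf16 = text.encode("utf-16-be", "surrogatepass")
    out = bytearray()
    for i in range(0, len(utf16), 2):
        unit = int.from_bytes(utf16[i:i + 2], "big")
        # Each 16-bit unit, surrogate or not, gets its own 1-3 byte sequence.
        out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


bmp = "\u00e9"              # U+00E9, inside the Basic Multilingual Plane
supplementary = "\U00010400"  # U+10400, UTF-16 surrogate pair D801 DC00

print(bmp.encode("utf-8").hex(), cesu8_encode(bmp).hex())
print(supplementary.encode("utf-8").hex(), cesu8_encode(supplementary).hex())
```

For the BMP character both encodings give the same bytes. For U+10400, UTF-8 gives the four bytes F0 90 90 80, while CESU-8 gives the six bytes ED A0 81 ED B0 80, that is, one three-byte sequence per surrogate.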
Advantages
What are the advantages over UTF-8? --Apoc2400 09:15, 6 December 2006 (UTC)
- I can think of three:
- 1: If used for serialisation of strings in languages where strings are natively UTF-16, it won't break if someone decides to use string types for something other than valid UTF-16 data.
- 2: If sorted using a byte-orientated sort, the result will be the same as using a word-orientated (16-bit code unit) sort on UTF-16.
- 3: Conversion between CESU-8 and UTF-16 is simpler than conversion between UTF-8 and UTF-16.
- But as the name suggests, the main reason it is used is compatibility with old software. Software that was built to use UCS-2 internally and UTF-8 externally, and that does not explicitly reject surrogate code points, can be made to store surrogates (for supplementary characters) by feeding its external interfaces with CESU-8 data (which will be converted to UTF-16 for storage internally). Plugwash 23:07, 7 December 2006 (UTC)
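- Points 1 and 2 above can be checked with a small Python sketch. The helper names `utf16_units` and `cesu8_encode` are hypothetical, written only for this illustration, and the two sample strings are arbitrary picks: one BMP character above the surrogate range and one supplementary character.

```python
def utf16_units(text: str) -> list[int]:
    """Hypothetical helper: return the UTF-16 code units of text."""
    data = text.encode("utf-16-be", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def cesu8_encode(text: str) -> bytes:
    """Hypothetical helper: encode every UTF-16 code unit separately."""
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in utf16_units(text))


words = ["\uff5e", "\U00010400"]  # U+FF5E (BMP) and U+10400 (supplementary)

# Point 2: byte order of CESU-8 follows UTF-16 code unit order; UTF-8 does not.
by_utf16 = sorted(words, key=utf16_units)
by_cesu8 = sorted(words, key=cesu8_encode)
by_utf8 = sorted(words, key=lambda s: s.encode("utf-8"))
print(by_utf16 == by_cesu8, by_utf16 == by_utf8)

# Point 1: an unpaired surrogate is not valid UTF-16, but still serialises.
lone = "\ud800"
print(cesu8_encode(lone).hex())
try:
    lone.encode("utf-8")  # strict UTF-8 refuses surrogate code points
except UnicodeEncodeError:
    print("strict UTF-8 rejects the lone surrogate")
```

In UTF-16 the supplementary character starts with the code unit D801, which sorts before FF5E, and its CESU-8 form starts with the byte ED, which sorts before EF. Its UTF-8 form starts with F0, which sorts after EF, so the UTF-8 byte order is reversed relative to the other two.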