Talk:International Components for Unicode

October 2006

I've just expanded the article a little and made some corrections. I used 2 sources not referenced in the article:

At one stage, IBM sold a C++ kit called the "Taligent Internationalization Library". I don't know if this came from CommonPoint, or was an early name for ICU4C.

Perhaps the article should mention a major design difference between ICU and C locales? In ICU, locales are just labels that a program can use to load an appropriate formatter, date converter, string bundle, etc. In C, locales carry all the locale-specific information with them, so one setlocale() call can change all locale-related settings.

Cheers, CWC(talk) 07:42, 15 October 2006 (UTC)

Neutrality

User Hdante (talk · contribs) has tagged the article with Template:POV-check for the words "much richer", in the sentence

ICU provides much richer internationalization facilities than the standard libraries for C or C++, and most operating systems.

Being richer than standard C or C++ is quite easy. Being richer than "standard Unix" isn't much harder.

Presumably Hdante is concerned that ICU is not "much richer" than operating systems such as Windows (see Uniscribe) and OS X (see ATSUI). Note that Uniscribe and ATSUI both provide rendering, whereas ICU does not. Can someone familiar with Uniscribe and ATSUI tell us how they compare to ICU for text processing? Cheers, CWC(talk) 03:10, 5 November 2006 (UTC)

The term "richer" is in question!?

C and C++ have almost no internationalization features when compared to .NET or Java. The POSIX API specification (Unix), which some people mistake for part of the C programming language, has some basic internationalization features. Unfortunately, most of the POSIX internationalization framework requires a whole application to use one locale at a time, set through setlocale(), instead of allowing a multithreaded application to use several locales at once.

C and C++ do not include or promise the following:

  • A Unicode based regular expression engine in order to handle text in multiple languages
  • Unicode based collation algorithm and language sensitive string searching
  • Handling of bidirectional (BiDi) text
  • Access to all the Unicode properties needed for proper handling of text in multiple languages
  • Calendars besides the Gregorian calendar
  • An extensive timezone API. The majority of POSIX implementations don't even provide the full Olson timezone ID or rules for the timezone.
  • A guarantee that Unicode is always available. There are many legacy codepages that are not portable enough to use reliably in source code.
  • ... and many other features.

If you're a pure Windows programmer, it's more difficult to say that ICU has a richer internationalization framework than Windows. Windows has a great internationalization framework integrated into the OS that is available throughout C and C++. Mac OS X also has some great internationalization features, but Mac OS X already uses ICU for many of these features (so it's not a useful point to compare ICU to ICU in this case).

There is a reason why "most" is used in this sentence. There are many other operating systems besides Windows that don't provide good internationalization features without ICU, like Linux, FreeBSD, NetBSD, OpenBSD, z/OS, i5/OS, Palm OS, Solaris, AIX and many other lesser-known operating systems. This is why many companies and open source projects already use ICU. If ICU didn't have a richer internationalization API than what C/C++ provided, no one would be using it.

ICU's layout engine is just a small part of ICU4C.

In general, I agree with CWC's comments.

User:UTF-8