Case folding

From Wikipedia, the free encyclopedia

Case folding is the conversion of all characters in a string to a single case, either lowercase or uppercase, typically so that strings can be compared case-insensitively.

Case folding in some high-level languages

Many BASIC dialects provide functions for this, such as:

UpperA$ = UCASE$("a")
LowerA$ = LCASE$("A")

C and C++, as well as any C-like language that provides the C standard library, declare these functions in the header ctype.h (cctype in C++):

char upperA = toupper('a');
char lowerA = tolower('A');
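
As an illustration of the case-insensitive comparison mentioned in the introduction, the following is a minimal sketch built on tolower. The function name compare_folded is hypothetical and not part of any standard library, and the result depends on the current C locale.

```c
#include <ctype.h>

/* Hypothetical helper: compare two strings after folding each
   character to lowercase. Returns <0, 0 or >0, like strcmp. */
int compare_folded(const char *a, const char *b)
{
    for (;; a++, b++) {
        /* Cast to unsigned char first: passing a negative value
           (other than EOF) to tolower is undefined behaviour. */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);

        /* Stop at the first difference or at the end of both strings. */
        if (ca != cb || ca == '\0')
            return ca - cb;
    }
}
```

Under these assumptions, compare_folded("Hello", "hELLO") returns 0, while strings that differ in more than case give a nonzero result.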

Algorithms to fold case

Case folding depends on the character set. In ASCII, case can be folded as follows in C:

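#include <ctype.h>  /* islower, isupper */
/* Renamed so they do not clash with the standard toupper and tolower; the argument is evaluated more than once. */
#define ASCII_TOUPPER(c) (islower(c) ? (c) - 'a' + 'A' : (c))
#define ASCII_TOLOWER(c) (isupper(c) ? (c) - 'A' + 'a' : (c))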

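This works only because of how ASCII lays out the alphabetic letters: each case occupies a run of consecutive codes, so the two cases differ by a constant offset. Such assumptions are not portable to every character set. EBCDIC, for instance, does not store the letters contiguously, so code that assumes a contiguous range of letters cannot be relied on there.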
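
The same arithmetic can be sketched as functions rather than macros, which avoids evaluating the argument more than once. The names ascii_fold_upper and ascii_fold_string are hypothetical, and the code assumes an ASCII-compatible execution character set. The explicit range test is exactly the kind of assumption that breaks when the letters are not contiguous.

```c
/* Hypothetical helper: fold one character to uppercase, assuming ASCII. */
int ascii_fold_upper(int c)
{
    /* The range test relies on 'a'..'z' being contiguous, as in ASCII. */
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 'A';
    return c;  /* everything else is left unchanged */
}

/* Hypothetical helper: fold a NUL-terminated string in place. */
void ascii_fold_string(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)ascii_fold_upper((unsigned char)*s);
}
```

Digits, punctuation and bytes outside the ASCII range pass through untouched, so the sketch is safe to apply to mixed text but folds only the 26 basic Latin letters.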

Unicode case folding


Unicode defines case folding through the three case mapping properties of each character: uppercase, lowercase and titlecase. These properties relate each character in a script that distinguishes case to the other case variants of that character.
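
To make the three mappings concrete, here is a small table-driven sketch in C. The struct case_mapping, the table and the lookup functions are all hypothetical; the table is a hand-written excerpt of five code points, whereas a real implementation would generate it from the Unicode Character Database. The entries for U+01C4, U+01C5 and U+01C6 show a character whose titlecase form differs from both its uppercase and lowercase forms.

```c
#include <stddef.h>

/* Hypothetical record: the simple case mappings of one code point. */
struct case_mapping {
    unsigned long code;
    unsigned long upper;
    unsigned long lower;
    unsigned long title;
};

/* Hand-written excerpt for illustration only. */
static const struct case_mapping mappings[] = {
    { 0x0041, 0x0041, 0x0061, 0x0041 },  /* LATIN CAPITAL LETTER A */
    { 0x0061, 0x0041, 0x0061, 0x0041 },  /* LATIN SMALL LETTER A */
    { 0x01C4, 0x01C4, 0x01C6, 0x01C5 },  /* LATIN CAPITAL LETTER DZ WITH CARON */
    { 0x01C5, 0x01C4, 0x01C6, 0x01C5 },  /* titlecase form: capital D, small z with caron */
    { 0x01C6, 0x01C4, 0x01C6, 0x01C5 },  /* LATIN SMALL LETTER DZ WITH CARON */
};

/* Linear search; returns NULL when the code point is not in the excerpt. */
static const struct case_mapping *find_mapping(unsigned long code)
{
    size_t i;
    for (i = 0; i < sizeof mappings / sizeof mappings[0]; i++)
        if (mappings[i].code == code)
            return &mappings[i];
    return NULL;
}

/* Code points without an entry map to themselves. */
unsigned long to_upper_cp(unsigned long code)
{
    const struct case_mapping *m = find_mapping(code);
    return m ? m->upper : code;
}

unsigned long to_lower_cp(unsigned long code)
{
    const struct case_mapping *m = find_mapping(code);
    return m ? m->lower : code;
}

unsigned long to_title_cp(unsigned long code)
{
    const struct case_mapping *m = find_mapping(code);
    return m ? m->title : code;
}
```

With this excerpt, to_title_cp(0x01C6) yields 0x01C5 while to_upper_cp(0x01C6) yields 0x01C4, which is why titlecase needs a mapping of its own.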