Collating sequence
From Wikipedia, the free encyclopedia
The term collating sequence refers to the order in which character strings should be placed when sorting them.
A common example is the familiar "alphabetic order," in which "Alfred" occurs before "Zeus" because "A" occurs before "Z" in the English alphabet. A collating sequence, particularly one used by a computer system, must also settle other issues:
- Upper and lower case: Should "Alfred" be placed before or after "alfred"? Generally one would say "neither," because an upper-case "A" and a lower-case "a" are usually considered to be the same letter. Some applications, however, need to distinguish the two and sort one before the other.
- National characters, accents, tildes: Many languages place such marks over and around letters, but, as with case, speakers of the language might consider the marked and unmarked characters to be "the same."
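The two issues above can be illustrated with a minimal sketch in Python. The function name `fold_key` and the sample names are hypothetical, chosen only for illustration. The sketch treats letters that differ only in case or in an accent mark as "the same" by stripping combining marks and folding case before comparison. This is a simplification and not a full, language-aware collation.

```python
import unicodedata

def fold_key(s: str) -> str:
    """Hypothetical sort key: ignore case and strip accent marks."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", s)
    # Drop the combining marks, keeping only the base characters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() removes case distinctions.
    return base.casefold()

names = ["Zeus", "alfred", "Émile", "Alfred", "emile"]

# Strings with equal keys keep their original relative order (stable sort).
print(sorted(names, key=fold_key))
# ['alfred', 'Alfred', 'Émile', 'emile', 'Zeus']
```

Because "Alfred" and "alfred" receive the same key here, neither is forced before the other. An application that needs a fixed order between them would have to add a tie-breaking rule.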
In a computer system, each character is necessarily assigned a unique numeric code (as in the ASCII or Unicode character set), but the proper and customary ordering of strings is not obtained by simply comparing those codes numerically. In ASCII, for example, "Z" (code 90) precedes "a" (code 97), so a purely numeric comparison would place "Zeus" before "alfred". Rather, the ordering is determined by reference to the collating sequence.
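The difference between comparing numeric codes and consulting a collating sequence can be sketched as follows. The names `SEQUENCE`, `RANK`, and `collation_key` are assumptions made for this example, and the sequence "AaBbCc..." is just one possible ordering, not a standard one.

```python
import string

words = ["alfred", "Zeus", "Alfred", "zebra"]

# Plain comparison uses the numeric codes: every upper-case ASCII letter
# (65-90) comes before every lower-case one (97-122).
print(sorted(words))
# ['Alfred', 'Zeus', 'alfred', 'zebra']

# Hypothetical collating sequence: a character's position in this string,
# not its numeric code, defines its rank ("AaBbCc...Zz").
SEQUENCE = "".join(
    upper + lower
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
)
RANK = {ch: i for i, ch in enumerate(SEQUENCE)}

def collation_key(s: str) -> list[int]:
    """Map each character to its rank in SEQUENCE.

    Characters outside the sequence sort after all listed letters,
    ordered among themselves by numeric code.
    """
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in s]

print(sorted(words, key=collation_key))
# ['Alfred', 'alfred', 'Zeus', 'zebra']
```

With this particular sequence, "Alfred" and "alfred" end up adjacent and both precede "Zeus", which matches the customary alphabetic expectation more closely than the ordering by numeric code.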