T9 (predictive text)
T9, which stands for Text on 9 keys, is a U.S.-patented[1][2][3] predictive text technology for mobile phones (specifically those that contain a 3x4 numeric keypad), originally developed by Tegic Communications, now part of Nuance Communications.[4]
T9 was used on phones from Verizon Wireless, NEC, Nokia, Samsung Electronics, Siemens, Sony Ericsson, Sanyo, Sagem and others. It was also used on the Texas Instruments Avigo PDA during the late 1990s. Its main competitors are iTap, created by Motorola; SureType, created by RIM; Eatoni's LetterWise and WordWise; and Intelab's Tauto.
Design
T9's objective is to make it easier to type text messages. It allows words to be entered by a single keypress for each letter, as opposed to the multi-tap approach used in conventional mobile phone text entry, in which several letters are associated with each key, and selecting one letter often requires multiple keypresses.
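The difference in keypress counts can be illustrated with a minimal Python sketch. It assumes the standard letter layout on keys 2 to 9; the function names multitap_presses and single_press_count are hypothetical, and the multi-tap count ignores the pause needed between consecutive letters that share a key.

```python
# Standard letter layout of a 3x4 phone keypad (keys 2-9 carry letters).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def multitap_presses(word):
    """Count the keypresses needed to spell `word` with multi-tap."""
    total = 0
    for ch in word.lower():
        for letters in KEYPAD.values():
            if ch in letters:
                # The n-th letter on a key costs n presses.
                total += letters.index(ch) + 1
                break
        else:
            raise ValueError(f"no key carries {ch!r}")
    return total


def single_press_count(word):
    """Count the keypresses when each letter costs exactly one press."""
    return len(word)


if __name__ == "__main__":
    for w in ("the", "home"):
        print(w, multitap_presses(w), single_press_count(w))
```

For 'the', multi-tap needs 5 presses (8, 4-4, 3-3) against 3 presses with one key per letter; selecting among several matching words may add presses of the "Next" key on top of that.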
T9 combines the groups of letters on each phone key with a fast-access dictionary of words. It looks up all dictionary words that correspond to the sequence of keypresses and orders them by frequency of use. As T9 "gains familiarity" with the words and phrases the user commonly uses, it speeds up entry by offering the most frequently used words first, and lets the user reach other choices with one or more presses of a predefined "Next" key.
The dictionary can be expanded by adding missing words, enabling them to be recognized in the future. After a new word has been introduced, T9 adds it to the predictive dictionary the next time the user tries to produce that word.
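The lookup described here can be sketched in Python as a table from key sequences to words, sorted by frequency. This is only an illustration, not T9's actual data structure: the names word_to_keys, build_index and candidates are hypothetical, and the frequencies are made up for the example.

```python
from collections import defaultdict

# Standard letter layout of a 3x4 phone keypad (keys 2-9 carry letters).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def word_to_keys(word):
    """Translate a word into the digit sequence that types it."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def build_index(word_freqs):
    """Group words by key sequence, most frequent word first."""
    grouped = defaultdict(list)
    for word, freq in word_freqs.items():
        grouped[word_to_keys(word)].append((freq, word))
    return {
        keys: [w for _, w in sorted(entries, key=lambda e: (-e[0], e[1]))]
        for keys, entries in grouped.items()
    }


# Made-up frequencies, for illustration only.
WORD_FREQS = {"good": 50, "home": 40, "gone": 30, "hood": 5, "the": 100}
INDEX = build_index(WORD_FREQS)


def candidates(keys):
    """Return the words for a key sequence; 'Next' would step through them."""
    return INDEX.get(keys, [])


if __name__ == "__main__":
    print(candidates("4663"))
    print(candidates("843"))
```

With these assumed frequencies, candidates("4663") returns 'good', 'home', 'gone', 'hood' in that order, and each press of "Next" would move one step along the list.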
The user database (UDB) can be expanded via multi-tap. The implementation of the user database depends on the version of T9 and how T9 is actually integrated on the device. Some phone manufacturers implement a permanent user database, while others implement one for the duration of the session.
Features
Some T9 implementations feature smart punctuation. This feature allows the user to insert sentence and word punctuation using the '1' key. Depending on the context, smart punctuation inserts sentence punctuation (period or 'full stop'), embedded punctuation (period or hyphen) or word punctuation (the apostrophe in can't, won't, isn't, and the possessive 's). Depending on the language, T9 also supports word breaking after punctuation to support clitics such as l' and n' in French and 's in English.
The UDB is an optional feature which allows words that were explicitly entered by the user to be stored for future reference. The number of words stored depends on the implementation as well as the language.
In later versions of T9, the order of the words presented adapts to the usage pattern. For instance, in English, 4663 matches "good", "home", "gone", "hood", etc. Such combinations are known as textonyms; e.g., "home" is referred to as a textonym of "good". When the user uses "home" more often than "good", eventually the two words will switch position. Information about common word combinations can also be learned from the user and stored for future predictions.
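How two textonyms might switch position can be sketched in Python with a per-word usage counter. The class AdaptiveCandidates and its methods are hypothetical, and the rule used here (sort by number of selections, keeping the earlier order on ties) is an assumption; the source does not specify how T9 weighs usage.

```python
class AdaptiveCandidates:
    """Candidate list for one key sequence that reorders itself by usage."""

    def __init__(self, words):
        # Start from the dictionary order; no word has been selected yet.
        self.order = list(words)
        self.uses = {w: 0 for w in words}

    def select(self, word):
        """Record that the user picked `word`, then re-rank the list."""
        self.uses[word] += 1
        # list.sort is stable, so words with equal counts keep their order.
        self.order.sort(key=lambda w: -self.uses[w])

    def ranked(self):
        return list(self.order)


if __name__ == "__main__":
    textonyms = AdaptiveCandidates(["good", "home", "gone", "hood"])
    textonyms.select("good")
    textonyms.select("home")
    print(textonyms.ranked())  # 'good' still leads: the counts are tied
    textonyms.select("home")
    print(textonyms.ranked())  # 'home' has now overtaken 'good'
```

In this sketch 'home' moves ahead of 'good' only once it has been selected more often, which mirrors the "eventually" in the description above.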
For words entered by the user, word completion can be enabled. When the user enters matching keypresses, the system provides completions in addition to words and stems.
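Completion over user-entered words can be sketched in Python as a prefix match on key sequences. The function completions and the sample word list are hypothetical; a real implementation would presumably also rank the results.

```python
# Standard letter layout of a 3x4 phone keypad (keys 2-9 carry letters).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def word_to_keys(word):
    """Translate a word into the digit sequence that types it."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def completions(typed_keys, user_words):
    """Return user words that are longer than the keys typed so far and
    whose key sequence begins with those keys."""
    return [
        w for w in user_words
        if len(w) > len(typed_keys) and word_to_keys(w).startswith(typed_keys)
    ]


if __name__ == "__main__":
    user_words = ["felix", "tegic"]  # hypothetical user database contents
    print(completions("335", user_words))
    print(completions("83", user_words))
```

Here the keys 3-3-5 are enough to offer 'felix' as a completion, because its full sequence 33549 starts with 335.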
In later versions of T9, the user can select a primary and secondary language and matches from both languages are presented. This enables users to write messages in their native as well as a foreign language.
Some implementations also learn commonly used word pairs and provide word prediction (e.g. if one often writes "eat food", after entering "eat" the phone will suggest "food", which can be confirmed by simply pressing "Next").
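Word-pair learning of this kind can be sketched in Python by counting which word follows which. The class PairPredictor is hypothetical and uses plain bigram counts as an assumption; the source does not say how implementations store or score word pairs.

```python
from collections import Counter, defaultdict


class PairPredictor:
    """Suggest the word that most often followed the previous word."""

    def __init__(self):
        # following[prev][nxt] = how often `nxt` was typed right after `prev`
        self.following = defaultdict(Counter)

    def learn(self, text):
        """Count every adjacent word pair in a message."""
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.following[prev][nxt] += 1

    def predict(self, prev):
        """Return the most frequent follower of `prev`, or None if unseen."""
        counts = self.following.get(prev.lower())
        if not counts:
            return None
        return counts.most_common(1)[0][0]


if __name__ == "__main__":
    predictor = PairPredictor()
    for message in ("eat food", "eat food", "eat cake"):
        predictor.learn(message)
    print(predictor.predict("eat"))
    print(predictor.predict("drink"))
```

After these three messages, 'food' has followed 'eat' twice and 'cake' once, so 'food' is the suggestion; a word that has never been seen yields no suggestion.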
T9 can also automatically recognise and correct typing/texting errors by looking at neighbouring keys on the keypad to identify an incorrect keypress. For example, the word "testing" would be entered with the key combination "8378464". Entering the same sequence with two incorrect keypresses on neighbouring keys, e.g. "8278494", still results in T9 suggesting the words "tasting" (8278464), "testing" (8378464), and "tapping" (8277464).
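The neighbouring-key idea can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the source: keys count as neighbours only when they are horizontally or vertically adjacent on the 3x4 grid, at most two slips are tolerated, and candidates are ranked by the number of slips. The names slip_count and fuzzy_candidates are hypothetical.

```python
# Standard letter layout of a 3x4 phone keypad (keys 2-9 carry letters).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}
GRID = ["123", "456", "789", "*0#"]  # physical key positions


def word_to_keys(word):
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def build_neighbours():
    """Map each key to the keys directly above, below, left and right."""
    neighbours = {}
    for r, row in enumerate(GRID):
        for c, key in enumerate(row):
            adjacent = set()
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < len(GRID) and 0 <= cc < len(GRID[rr]):
                    adjacent.add(GRID[rr][cc])
            neighbours[key] = adjacent
    return neighbours


NEIGHBOURS = build_neighbours()


def slip_count(entered, target):
    """Number of positions where `entered` hit a neighbour of the target
    key, or None if the difference cannot be explained by such slips."""
    if len(entered) != len(target):
        return None
    slips = 0
    for e, t in zip(entered, target):
        if e == t:
            continue
        if e in NEIGHBOURS.get(t, ()):
            slips += 1
        else:
            return None
    return slips


def fuzzy_candidates(entered, words, max_slips=2):
    """Words reachable from `entered` with at most `max_slips` slips,
    fewest slips first (ties keep the order of `words`)."""
    scored = []
    for w in words:
        s = slip_count(entered, word_to_keys(w))
        if s is not None and s <= max_slips:
            scored.append((s, w))
    scored.sort(key=lambda pair: pair[0])
    return [w for _, w in scored]


if __name__ == "__main__":
    print(fuzzy_candidates("8278494", ["testing", "tasting", "tapping", "the"]))
```

For the input 8278494, 'tasting' differs in one position (9 for 6), while 'testing' and 'tapping' each differ in two, so all three are returned under these assumptions.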
Algorithm
To achieve compression ratios close to 1 byte per word, T9 uses an optimized algorithm that maintains the order of words and partial words (also known as stems). Because of this compression, it over-generates words, which are sometimes visible to the user as "junk words". This is a side effect of the small database sizes required on lower-end embedded devices.
Examples of use
On a phone with a simple numeric keypad, each time a key (1-9) is pressed in a text field, the algorithm returns a guess at which letters are most likely for the keys pressed up to that point. For example, to enter the word 'the', one would press 8, then 4, then 3, and the display would show 't', then 'th', then 'the'. If the less common word 'fore' were intended, one would enter 3, 6, 7, 3, and the predictive algorithm might select 'Ford'. Pressing the 'next' key (typically the '*' key) might bring up 'dose', and finally 'fore'. If 'fore' is selected, it is more likely to be the first word displayed the next time the user presses the sequence 3-6-7-3. If the word 'Felix' were intended, however, the display shows 'E', then 'De', 'Del', 'Deli', and finally 'Felix' as 3, 3, 5, 4, 9 are entered. This is an example of letters changing while a word is being entered.
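The way the displayed letters can change from one keypress to the next can be sketched in Python. The sketch assumes a toy three-word dictionary with made-up frequencies and a simple rule: show the first letters of the most frequent word whose key sequence starts with the keys typed so far. The names best_guess and display_sequence are hypothetical, and because the dictionary is a toy one, the intermediate guesses differ from those quoted above.

```python
# Standard letter layout of a 3x4 phone keypad (keys 2-9 carry letters).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}

# Toy dictionary with made-up frequencies, for illustration only.
WORD_FREQS = {"the": 100, "deli": 30, "felix": 10}


def word_to_keys(word):
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def best_guess(keys):
    """Letters to display after `keys`: the start of the most frequent
    word whose key sequence begins with `keys` (empty if none matches)."""
    matches = [w for w in WORD_FREQS if word_to_keys(w).startswith(keys)]
    if not matches:
        return ""
    best = max(matches, key=lambda w: WORD_FREQS[w])
    return best[:len(keys)]


def display_sequence(keys):
    """What the display shows after each successive keypress."""
    return [best_guess(keys[:n]) for n in range(1, len(keys) + 1)]


if __name__ == "__main__":
    print(display_sequence("843"))
    print(display_sequence("33549"))
```

With this toy dictionary, 8-4-3 shows 't', 'th', 'the', while 3-3-5-4-9 shows 'd', 'de', 'del', 'deli' and only switches to 'felix' on the last key, when 'deli' no longer matches.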