A GADDAG is a data structure presented by Steven Gordon in 1994, for use in generating moves for Scrabble and other word-generation games where such moves require words that "hook into" existing words. It is often in contrast to move-generation algorithms using a Directed Acyclic Word Graph (DAWG), such as the one used by Maven. It is generally twice as fast as the traditional DAWG algorithms, but take about 5 times as much space for regulation Scrabble dictionaries.[1]
Quackle uses a GADDAG to generate moves.
Contents |
A GADDAG is a specialization of a Trie, containing states and branches to other GADDAGs. It is distinct for its storage of every reversed prefix of every word in a dictionary. This means every word has as many representations as it does letters; since the average word in most Scrabble regulation dictionaries is 5 letters long, this makes the GADDAG about 5 times as big as a simple DAWG.
For any word in a dictionary that is formed by a non-empty prefix x and a suffix y, a GADDAG contains a direct, deterministic path for any string REV(x)৳y, where ৳ is a concatenation operator.
For example, for the word "explain," a GADDAG will contain direct paths to the strings "e৳xplain," "xe৳plain," "pxe৳lain," "lpxe৳ain," "alpxe৳in," "ialpxe৳n," and "nialpxe."
Any move-generation algorithm must adhere to three types of constraints:
The DAWG algorithms speed up and take advantage of the second and third constraint: the DAWG is build around the dictionary, and you only traverse it using tiles from your rack. It fails, however, to address the first constraint: supposing you want to 'hook into' the letter P in HARPY, and the board has 2 spaces before the P, you must search the dictionary for all words containing letters from your rack where the third letter is P. This is non-deterministic when searching through the DAWG, as many searches through the trie will be fruitless.
This is addressed by the GADDAG's storage of prefixes: by traversing the P branch of a GADDAG, you see all words that have a P somewhere in their composition, and can "travel up" the prefix to form the word with tiles in your rack. To use the example from the definition section, to hook onto the P, you will immediately see a path for pxe৳lain. You add tiles from your rack while appropriate, traveling backwards through the word until you encounter the ৳, meaning you've completed the prefix. You complete the move by adding to the front of the word with the suffix.