An endgame tablebase is a computerized database that contains precalculated exhaustive analysis of chess endgame positions. It is typically used by a computer chess engine during play, or by a human or computer that is retrospectively analyzing a game that has already been played.
The tablebase contains the game-theoretical value (win, loss, or draw) of each possible move in each possible position, and how many moves it would take to achieve that result with perfect play. Thus, the tablebase acts as an oracle, always providing the optimal moves. Typically the database records each possible position with certain pieces remaining on the board, and the best moves with White to move and with Black to move.
Tablebases are generated by retrograde analysis, working backwards from a checkmated position. Tablebases have solved chess for every position with six or fewer pieces (including the two kings). The solutions have profoundly advanced the chess community's understanding of endgame theory. Some positions which humans had analyzed as draws were proved to be winnable; the tablebase analysis could find a mate in more than a hundred moves, far beyond the horizon of humans, and beyond the capability of a computer during play. Tablebases have enhanced competitive play and facilitated the composition of endgame studies. They provide a powerful analytical tool.
Endgame tablebases also exist for other board games, such as checkers,[1] chess variants,[2] and Nine Men's Morris,[3] but when the game is not specified, the term refers to chess.
This article uses algebraic notation to describe chess moves.
Physical limitations of computer hardware aside, in principle it is possible to solve any game under the condition that the complete state is known and there is no random chance. Strong solutions, i.e. algorithms that can produce perfect play from any position,[4] are known for some simple games such as Tic Tac Toe (draw with perfect play) and Connect Four (first player wins). Weak solutions exist for somewhat more complex games, such as checkers (with perfect play on both sides the game is known to be a draw, but it is not known for every position created by less-than-perfect play what the perfect next move would be). Other games, such as chess (from the starting position) and Go, have not been solved because their game complexity is too vast for computers to evaluate all possible positions. To reduce the game complexity, researchers have modified these complex games by reducing the size of the board, or the number of pieces, or both.
Computer chess is one of the oldest domains of artificial intelligence, having begun in the early 1930s. Claude Shannon proposed formal criteria for evaluating chess moves in 1949. In 1951, Alan Turing designed a primitive chess playing program, which assigned values for material and mobility; the program "played" chess based on Turing's manual calculations.[5] However, even as competent chess programs began to develop, they exhibited a glaring weakness in playing the endgame. Programmers added specific heuristics for the endgame – for example, the king should move to the center of the board.[6] However, a more comprehensive solution was needed.
In 1965, Richard Bellman proposed the creation of a database to solve chess and checkers endgames using retrograde analysis.[7][8] Instead of analyzing forward from the position currently on the board, the database would analyze backward from positions where one player was checkmated or stalemated. Thus, a chess computer would no longer need to analyze endgame positions during the game because they were solved beforehand. It would no longer make mistakes because the tablebase always played the best possible move.
In 1970, Thomas Ströhlein published a doctoral thesis[9][10] with analysis of the following classes of endgame: KQK, KRK, KPK, KQKR, KRKB, and KRKN.[11] In 1977, the KQKR database was used in a match against Grandmaster Walter Browne (see Computer chess § Using endgame databases).
Ken Thompson and others helped extend tablebases to cover all four- and five-piece endgames, including in particular KBBKN, KQPKQ, and KRPKR.[12][13] Lewis Stiller published a thesis with research on some six-piece tablebase endgames in 1995.[14][15]
More recent contributors have included the following people:
The tablebases of all endgames with up to six pieces are available for free download, and may also be queried using web interfaces (see the external links below). The Nalimov tablebases require more than one terabyte of storage space.[16][17]
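Before creating a tablebase, a programmer must choose a metric of optimality – in other words, the programmer must define at what point a player has "won" the game. Every position can be defined by its distance (i.e. the number of moves) from the desired endpoint. Two metrics are generally used: depth to conversion (DTC), which counts the moves until the position converts to a different material balance (for example by a capture or promotion) or ends in checkmate, and depth to mate (DTM), which counts the moves until checkmate.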
Haworth has discussed two other metrics, namely "depth to zeroing-move" (DTZ) and "depth by the rule" (DTR). These metrics support the fifty-move rule, but DTR tablebases have not yet been computed, and DTZ tablebases have not yet been generally released to the public.[18]
The difference between DTC and DTM can be understood by analyzing the diagram at right. How White should proceed depends on which metric is used.
Metric | Play | DTC | DTM |
---|---|---|---|
DTC | 1. Qxd1 Kc8 2. Qd2 Kb8 3. Qd8 mate | 1 | 3 |
DTM | 1. Qc7+ Ka8 2. Qa7 mate | 2 | 2 |
According to the DTC metric, White should capture the rook because that leads immediately to a position which will certainly win (DTC = 1), but it will take two more moves to deliver checkmate (DTM = 3). In contrast, according to the DTM metric, White mates in two moves, so DTM = DTC = 2.
This difference is typical of many endgames. Usually DTC is smaller than DTM, but the DTM metric leads to the quickest checkmate. Exceptions occur where the weaker side has only a king, and in the unusual endgame of two knights versus one pawn; then DTC = DTM because either there is no defending material to capture or capturing the material does no good. (Indeed, capturing the defending pawn in the latter endgame results in a draw.)
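The effect of the metric on move choice can be illustrated with a minimal sketch. It uses only the two candidate moves and the DTC and DTM values from the table above; the dictionary layout and the function name `best_move` are illustrative assumptions, not a real tablebase format.

```python
# Candidate first moves for White with the values from the table above.
# This layout is an illustrative assumption, not a real tablebase format.
CANDIDATES = {
    "Qxd1": {"DTC": 1, "DTM": 3},
    "Qc7+": {"DTC": 2, "DTM": 2},
}


def best_move(candidates, metric):
    """Return the move with the smallest distance under the given metric."""
    return min(candidates, key=lambda move: candidates[move][metric])


print(best_move(CANDIDATES, "DTC"))  # Qxd1
print(best_move(CANDIDATES, "DTM"))  # Qc7+
```

The same position therefore yields different "optimal" moves depending on which distance the tablebase stores.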
Once a metric is chosen, the first step is to generate all the positions with a given material. For example, to generate a DTM tablebase for the endgame of king and queen versus king (KQK), the computer must describe approximately 40,000 unique legal positions.
Levy and Newborn explain that the number 40,000 derives from a symmetry argument. The Black king can be placed on any of ten squares: a1, b1, c1, d1, b2, c2, d2, c3, d3, and d4 (see diagram). On any other square, its position can be considered equivalent by symmetry of rotation or reflection. Thus, there is no difference whether a Black king in a corner resides on a1, a8, h8, or h1. Multiply this number of 10 by at most 60 (legal remaining) squares for placing the White king and then by at most 62 squares for the White queen. The product 10×60×62 = 37,200. Several hundred of these positions are illegal, impossible, or symmetrical reflections of each other, so the actual number is somewhat smaller.[19][20]
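The symmetry argument can be checked with a short sketch. It assumes squares are written as a file letter followed by a rank digit; the helper names `canonical_king_squares` and `canonicalize` are hypothetical and chosen only for this illustration.

```python
FILES = "abcdefgh"


def canonical_king_squares():
    """Squares of the triangle a1-d1-d4 (file a-d, rank not above the diagonal)."""
    return [f"{FILES[f]}{r + 1}" for f in range(4) for r in range(f + 1)]


def canonicalize(square):
    """Map any square onto the a1-d1-d4 triangle by reflection."""
    f, r = FILES.index(square[0]), int(square[1]) - 1
    f, r = min(f, 7 - f), min(r, 7 - r)  # reflect into the a1-d4 quadrant
    if r > f:
        f, r = r, f                      # reflect across the a1-h8 diagonal
    return f"{FILES[f]}{r + 1}"


squares = canonical_king_squares()
print(squares)                 # the ten squares listed in the text
print(canonicalize("h8"))      # a1: all four corners are equivalent
print(len(squares) * 60 * 62)  # 37200, the upper bound quoted above
```

Every one of the 64 squares maps to one of these ten, which is why the Black king needs to be placed on only ten squares in a pawnless endgame.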
For each position, the tablebase evaluates the situation separately for White-to-move and Black-to-move. Assuming that White has the queen, almost all the positions are White wins, with checkmate forced in not more than ten moves. Some positions are draws because of stalemate or the unavoidable loss of the queen.
Each additional piece added to a pawnless endgame multiplies the number of unique positions by a factor of about sixty, which is the approximate number of squares not already occupied by other pieces.
Endgames with one or more pawns increase the complexity because the symmetry argument is weakened. Since pawns can move forward but not sideways, rotation and vertical reflection of the board produce a fundamental change in the nature of the position.[21] The best use of symmetry is achieved by limiting one pawn to the 24 squares in the rectangle a2-a7-d7-d2. All other pieces and pawns may be located on any of the 64 squares with respect to the pawn. Thus, an endgame with pawns has a complexity of 24/10 = 2.4 times that of a pawnless endgame with the same number of pieces.
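The counting rules above can be combined into a rough estimate. The following sketch is an order-of-magnitude illustration only: it starts from the three-piece bound of 10×60×62, applies the factor of about sixty per extra piece, and applies the factor 24/10 for endgames with pawns. The function name `estimate_positions` is hypothetical, and real tablebase sizes depend on indexing and legality details that are ignored here.

```python
def estimate_positions(num_pieces, has_pawn=False):
    """Rough count of positions for an endgame with num_pieces (kings included)."""
    if num_pieces < 3:
        raise ValueError("expected at least three pieces")
    count = 10 * 60 * 62             # three-piece pawnless bound from the text
    count *= 60 ** (num_pieces - 3)  # about 60 free squares per extra piece
    if has_pawn:
        count = count * 24 // 10     # 24 pawn squares instead of 10 king squares
    return count


for n in (3, 4, 5, 6):
    print(n, estimate_positions(n), estimate_positions(n, has_pawn=True))
```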
Tim Krabbé explains the process of generating a tablebase as follows:
"The idea is that a database is made with all possible positions with a given material [note: as in the preceding section]. Then a subdatabase is made of all positions where Black is mated. Then one where White can give mate. Then one where Black cannot stop White giving mate next move. Then one where White can always reach a position where Black cannot stop him from giving mate next move. And so on, always a ply further away from mate until all positions that are thus connected to mate have been found. Then all of these positions are linked back to mate by the shortest path through the database. That means that, apart from 'equi-optimal' moves, all the moves in such a path are perfect: White's move always leads to the quickest mate, Black's move always leads to the slowest mate."[22]
The retrograde analysis is only necessary from the checkmated positions. Other positions need not be worked from because every position that is not reached from a checkmated position is a draw.[23]
Figure 1 illustrates the idea of retrograde analysis. White mates in two moves with 1. Kc6, leading to the position in Figure 2. Then if 1...Kb8 2. Qb7 mate, and if 1...Kd8 2. Qd7 mate (Figure 3).
Figure 3, before White's second move, is defined as "mate in one ply." Figure 2, after White's first move, is "mate in two ply," regardless of how Black plays. Finally, the initial position in Figure 1 is "mate in three ply" (i.e., two moves) because it leads directly to Figure 2, which is already defined as "mate in two ply." This process, which links a current position to another position that could have existed one ply earlier, can continue indefinitely.
Each position is evaluated as a win or loss in a certain number of moves. At the end of the retrograde analysis, positions which are not designated as wins or losses are necessarily draws.
Figure 1 | Figure 2 | Figure 3
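The backward labelling described above can be sketched on an abstract game graph rather than on real chess positions. In this sketch the position names (`fig1`, `fig2`, and so on) are hypothetical stand-ins for Figures 1 to 3, the `moves` dictionary is an assumed representation, and the function `retrograde` is an illustration of the general technique, not the generator used for any published tablebase.

```python
from collections import deque


def retrograde(moves, mated):
    """Label positions by working backwards from the mated ones.

    moves: dict mapping each position to the positions reachable in one ply.
    mated: positions in which the side to move is checkmated.
    Returns dict position -> ("win" | "loss", plies); unlisted positions are draws.
    """
    predecessors = {p: [] for p in moves}
    for p, successors in moves.items():
        for s in successors:
            predecessors[s].append(p)

    result = {p: ("loss", 0) for p in mated}
    unresolved = {p: len(moves[p]) for p in moves}  # moves not yet shown to lose
    queue = deque(mated)
    while queue:
        pos = queue.popleft()
        outcome, plies = result[pos]
        for prev in predecessors[pos]:
            if prev in result:
                continue
            if outcome == "loss":
                # The side to move in prev can reach a position lost for the opponent.
                result[prev] = ("win", plies + 1)
                queue.append(prev)
            else:
                unresolved[prev] -= 1
                if unresolved[prev] == 0:
                    # Every move from prev leads to a win for the opponent.
                    result[prev] = ("loss", plies + 1)
                    queue.append(prev)
    return result


# Hypothetical graph mirroring Figures 1-3, plus a two-position cycle that is a draw.
GRAPH = {
    "fig1": ["fig2", "other"],
    "fig2": ["fig3_b8", "fig3_d8"],
    "fig3_b8": ["mate_b7"],
    "fig3_d8": ["mate_d7"],
    "mate_b7": [],
    "mate_d7": [],
    "other": ["other2"],
    "other2": ["other"],
}

table = retrograde(GRAPH, ["mate_b7", "mate_d7"])
print(table["fig1"])     # ('win', 3): mate in three ply
print(table["fig2"])     # ('loss', 2): mated in two ply
print("other" in table)  # False: never connected to mate, hence a draw
```

Because positions are processed in order of increasing distance, a winning position receives the shortest path to mate and a losing position the longest resistance, in line with Krabbé's description.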
After the tablebase has been generated, and every position has been evaluated, the result must be verified independently. The purpose is to check the self-consistency of the tablebase results.[24]
For example, in Figure 1 above, the verification program sees the evaluation "mate in three ply (Kc6)." It then looks at the position in Figure 2, after Kc6, and sees the evaluation "mate in two ply." These two evaluations are consistent with each other. If the evaluation of Figure 2 were anything else, it would be inconsistent with Figure 1, so the tablebase would need to be corrected.
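A consistency check of this kind can be sketched as follows. The representation is an assumption made for illustration: `moves` maps each position to its successors, `table` stores a ("win" or "loss", plies) pair per decided position, and the position names are hypothetical. The sketch checks only won and lost entries and does not verify draws.

```python
def verify(moves, table):
    """Return the positions whose stored value disagrees with their successors.

    moves: dict position -> list of successor positions (one ply each).
    table: dict position -> ("win" | "loss", plies); missing positions are draws.
    """
    errors = []
    for pos, (outcome, plies) in table.items():
        values = [table.get(s) for s in moves[pos]]
        if outcome == "win":
            # The best move must reach a position lost for the opponent in plies - 1.
            losses = [v[1] for v in values if v and v[0] == "loss"]
            ok = bool(losses) and min(losses) == plies - 1
        elif not values:
            ok = plies == 0  # mated: no legal moves, zero plies
        else:
            # Every move must lead to an opponent win; the longest takes plies - 1.
            ok = (all(v and v[0] == "win" for v in values)
                  and max(v[1] for v in values) == plies - 1)
        if not ok:
            errors.append(pos)
    return errors


MOVES = {"fig1": ["fig2"], "fig2": ["fig3"], "fig3": ["mate"], "mate": []}
TABLE = {"fig1": ("win", 3), "fig2": ("loss", 2), "fig3": ("win", 1), "mate": ("loss", 0)}

print(verify(MOVES, TABLE))                            # []
print(verify(MOVES, dict(TABLE, fig2=("loss", 4))))    # ['fig1', 'fig2']
```

Corrupting the entry for the second position makes both it and its predecessor fail the check, which is the inconsistency described above.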
A four-piece tablebase must rely on three-piece tablebases that could result if one piece is captured. Similarly, a tablebase containing a pawn must be able to rely on other tablebases that deal with the new set of material after pawn promotion to a queen or other piece. The retrograde analysis program must account for the possibility of a capture or pawn promotion on the previous move.[25]
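The dependency between tablebases can be illustrated by listing the material balances reachable from a given one. This is a simplified sketch: the function name `successor_materials` and the piece-string convention (king first, one letter per piece) are assumptions, the result is not normalized into the usual stronger-side-first naming, and a pawn that captures and promotes in the same move is not modelled.

```python
def successor_materials(white, black):
    """List the material balances reachable by one capture or one promotion.

    white, black: piece strings that start with the king, e.g. "KRP" and "KR".
    Returns a sorted list of (white, black) pairs. Kings are never captured.
    """
    def without(pieces, i):
        return pieces[:i] + pieces[i + 1:]

    def promoted(pieces, i):
        return [pieces[:i] + new + pieces[i + 1:] for new in "QRBN"]

    result = set()
    for i in range(1, len(white)):
        result.add((without(white, i), black))  # Black captures a white piece
        if white[i] == "P":
            result.update((w, black) for w in promoted(white, i))
    for i in range(1, len(black)):
        result.add((white, without(black, i)))  # White captures a black piece
        if black[i] == "P":
            result.update((white, b) for b in promoted(black, i))
    return sorted(result)


print(successor_materials("KQ", "KR"))   # [('K', 'KR'), ('KQ', 'K')]
print(successor_materials("KRP", "KR"))
```

A generator for KRPKR would therefore need the results for each listed balance before it could evaluate captures and promotions correctly.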
Tablebases assume that castling is not possible for two reasons. First, in practical endgames, this assumption is almost always correct. (However, castling is allowed by convention in composed problems and studies.) Second, if the king and rook are on their original squares, castling may or may not be allowed. Because of this ambiguity, it would be necessary to make separate evaluations for states in which castling is or is not possible.
The same ambiguity exists for the en passant capture, since the possibility of en passant depends on the opponent's previous move. However, practical applications of en passant occur frequently in pawn endgames, so tablebases account for the possibility of en passant for positions where both sides have at least one pawn.
According to the method described above, the tablebase must allow the possibility that a given piece might occupy any of the 64 squares. In some positions, it is possible to restrict the search space without affecting the result. This saves computational resources and enables searches which would otherwise be impossible.
An early analysis of this type was published in 1987, in the endgame KRP(a2)KBP(a3), where the Black bishop moves on the dark squares (see example position at right).[26] In this position, we can make the following a priori assumptions:
The result of this simplification is that, instead of searching for 48 * 47 = 2,256 permutations for the pawns' locations, there is only one permutation. Reducing the search space by a factor of 2,256 facilitates a much quicker calculation.
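The figure of 2,256 can be reproduced with a one-line calculation. The sketch assumes that the 48 squares are the ones on ranks 2 to 7, where a pawn can legally stand; that reading is an assumption, since the text gives only the product.

```python
import math

pawn_squares = 8 * 6                     # files a-h, ranks 2-7 (assumed reading of "48")
placements = math.perm(pawn_squares, 2)  # ordered placements of two distinct pawns
print(placements)                        # 2256
```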
Bleicher has designed a commercial program called "Freezer," which allows users to build new tablebases from existing Nalimov tablebases with a priori information. The program can produce a tablebase for seven-piece positions with blocked pawns, even though seven-piece tablebases are generally not available.[28]
In correspondence chess, a player may consult a chess computer for assistance, provided that the etiquette of the competition allows this. A six-piece tablebase (KQQKQQ) was used to analyze the endgame that occurred in the correspondence game Kasparov versus The World.[29] Players have also used tablebases to analyze endgames from over-the-board play after the game is over.
Competitive players need to know that tablebases ignore the fifty-move rule. According to that rule, if fifty moves have passed without a capture or a pawn move, either player may claim a draw. FIDE changed the rules several times, starting in 1974, to allow one hundred moves for endgames where fifty moves were insufficient to win. In 1988, FIDE allowed seventy-five moves for KBBKN, KNNKP, KQKBB, KQKNN, KRBKR, and KQPKQ with the pawn on the seventh rank, because tablebases had uncovered positions in these endgames requiring more than fifty moves to win. In 1992, FIDE canceled these exceptions and restored the fifty-move rule to its original standing.[18] Thus a tablebase may identify a position as won or lost, when it is in fact drawn by the fifty-move rule.
Haworth has designed a tablebase that produces results consistent with the fifty-move rule. However, most tablebases search for the theoretical limits of forced mate, even if it requires several hundred moves.
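Whether a tablebase line collides with the fifty-move rule can be checked mechanically. The sketch below assumes a winning line is given as one boolean per ply, where True marks a capture or pawn move; the function name `first_fifty_move_claim` is hypothetical. Fifty moves by each side correspond to 100 consecutive plies without such a move.

```python
def first_fifty_move_claim(line):
    """Return the ply at which a draw could first be claimed, or None.

    line: sequence of booleans, one per ply; True marks a capture or pawn move.
    """
    quiet = 0
    for ply, zeroing in enumerate(line, start=1):
        quiet = 0 if zeroing else quiet + 1
        if quiet >= 100:
            return ply
    return None


# A 60-move mate (119 plies) with no capture or pawn move can be interrupted.
print(first_fifty_move_claim([False] * 119))                      # 100
# The same length with a capture on ply 50 stays within the rule.
print(first_fifty_move_claim([False] * 49 + [True] + [False] * 69))  # None
```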
The knowledge contained in tablebases affords the computer a tremendous advantage in the endgame. Not only can computers play perfectly within an endgame, but they can simplify to a winning tablebase position from a more complicated endgame.[30] For the latter purpose, some programs use "bitbases", which give the game-theoretical value of positions without the number of moves until conversion or mate; that is, they only reveal whether the position is won, lost, or drawn. Sometimes even this data is compressed and the bitbase reveals only whether a position is won or not, making no distinction between a lost and a drawn game.[23]
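The space saving of a bitbase comes from storing only a few bits per position. The following sketch packs win, draw, and loss codes at two bits per position; the numeric codes and the function names `pack` and `lookup` are illustrative assumptions, and actual bitbase formats such as the Shredderbases use their own layouts and compression.

```python
WIN, DRAW, LOSS = 0, 1, 2  # illustrative codes; real bitbase formats differ


def pack(values):
    """Pack win/draw/loss codes into bytes, four positions per byte."""
    packed = bytearray((len(values) + 3) // 4)
    for i, v in enumerate(values):
        packed[i // 4] |= v << (2 * (i % 4))
    return bytes(packed)


def lookup(packed, index):
    """Read back the two-bit code stored for one position index."""
    return (packed[index // 4] >> (2 * (index % 4))) & 0b11


values = [WIN, DRAW, LOSS, WIN, LOSS]
packed = pack(values)
print(len(packed))                                    # 2 bytes for 5 positions
print([lookup(packed, i) for i in range(len(values))])  # [0, 1, 2, 0, 2]
```

A bitbase that records only "won" or "not won" would need a single bit per position, halving the size again.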
However, some computer chess experts have observed practical drawbacks.[31] In addition to ignoring the fifty-move rule, a computer in a difficult position might avoid the losing side of a tablebase ending even if the opponent cannot practically win without also knowing the tablebase. The adverse effect could be a premature resignation, or an inferior line of play that loses with less resistance than play without a tablebase might offer.
Another drawback is that tablebases require a lot of memory to store the very large number of positions. The Nalimov tablebases, which use a special-purpose compression technique, require 7.05 GB of hard disk space for all five-piece endings. The six-piece endings require approximately 1.2 terabytes.[32][33] Nalimov seven-piece tablebases would require more hard drive storage capacity and RAM than will be practical for the foreseeable future. Bitbases, however, take much less space. Shredderbases, for example, used by the Shredder program, compress all three-, four-, and five-piece bases into 157 MB. This is a mere fraction of the 7.05 GB that the Nalimov tablebases require.[34]
Some computers play better overall if their memory is devoted instead to the ordinary search and evaluation function. Modern computers analyze far enough ahead conventionally to handle the elementary endgames without needing tablebases (i.e. without suffering from the horizon effect). It is only for more sophisticated positions that the tablebase will improve significantly over the ordinary computer.
In contexts where the fifty-move rule may be ignored, tablebases have answered longstanding questions about whether certain combinations of material are wins or draws. The following interesting results have emerged:
For some years, this position held the record for the longest computer-generated forced mate (Ottó Bláthy had already composed a "mate in 292 moves" problem in 1889).[44] However, in May 2006, Bourzutschky and Konoval discovered a KQNKRBN position with an astonishing DTC of 517 moves. This was more than twice as long as Stiller's maximum, and almost 200 moves beyond the previous record of a 330 DTC for a position of KQBNKQB_1001. Bourzutschky wrote, "This was a big surprise for us and is a great tribute to the complexity of chess."[45][46]
In August 2006, Bourzutschky released preliminary results from his analysis of the following seven-piece endgames: KQQPKQQ, KRRPKRR, and KBBPKNN.[24]
Many positions are winnable although at first sight they appear not to be. For example, this position is a win for Black in 154 moves, during which the white pawn is liquidated after around 80 moves; this is highly atypical for this kind of endgame (see the Six-Man Endgame Server, http://www.k4it.de/index.php?topic=egtb&lang=en).
Since many composed endgame studies deal with positions that exist in tablebases, their soundness can be checked using the tablebases. Some studies have been cooked, i.e. proved unsound, by the tablebases. That can be either because the composer's solution does not work, or else because there is an equally effective alternative that the composer did not consider. Another way tablebases cook studies is a change in the evaluation of an endgame. For instance, the endgame with a queen and bishop versus two rooks was thought to be a draw, but tablebases proved it to be a win for the queen and bishop, so almost all studies based on this endgame are unsound.[47]
For example, Erik Pogosyants composed the study at right, with White to play and win. His intended main line was 1. Ne3 Rxh2 2. O-O-O mate! A tablebase discovered that 1. h4 also wins for White in 33 moves, even though Black can capture the pawn. Capturing is not Black's best defense: it loses in 21 moves, whereas Kh1-g2 loses in 32 moves. Incidentally, the tablebase does not recognize the composer's solution because it includes castling.[48]
While tablebases have cooked some studies, they have assisted in the creation of other studies. Composers can search tablebases for interesting positions, such as zugzwang, using a method called data mining. For all three- to five-piece endgames and pawnless six-piece endgames, a complete list of mutual zugzwangs has been tabulated and published.[49][50][51]
There has been some controversy whether to allow endgame studies composed with tablebase assistance into composing tourneys. In 2003, the endgame composer and expert John Roycroft summarized the debate:
"[N]ot only do opinions diverge widely, but they are frequently adhered to strongly, even vehemently: at one extreme is the view that since we can never be certain that a computer has been used it is pointless to attempt a distinction, so we should simply evaluate a 'study' on its content, without reference to its origins; at the other extreme is the view that using a 'mouse' to lift an interesting position from a ready-made computer-generated list is in no sense composing, so we should outlaw every such position."[52]
Roycroft himself agrees with the latter approach. He continues, "One thing alone is clear to us: the distinction between classical composing and computer composing should be preserved for as long as possible: if there is a name associated with a study diagram that name is a claim of authorship."[52]
Mark Dvoretsky, an International Master, chess trainer, and author, took a more permissive stance. He was commenting in 2006 on a study by Harold van der Heijden, published in 2001, which reached the position at right after three introductory moves. The drawing move for White is 4. Kb4!! (and not 4. Kb5), based on a mutual zugzwang that may occur three moves later.
Dvoretsky comments:
"Here, we should touch on one delicate question. I am sure that this unique endgame position was discovered with the help of Thompson’s famous computer database. Is this a 'flaw,' diminishing the composer's achievement?
"Yes, the computer database is an instrument, available to anyone nowadays. Out of it, no doubt, we could probably extract yet more unique positions – there are some chess composers who do so regularly. The standard for evaluation here should be the result achieved. Thus: miracles, based upon complex computer analysis rather than on their content of sharp ideas, are probably of interest only to certain aesthetes."[53]
On the Bell Labs website, Ken Thompson maintains a link to some of his tablebase data. The headline reads, "Play chess with God."[54]
Regarding Stiller's long wins, Tim Krabbé struck a similar note:
"A grandmaster wouldn't be better at these endgames than someone who had learned chess yesterday. It's a sort of chess that has nothing to do with chess, a chess that we could never have imagined without computers. The Stiller moves are awesome, almost scary, because you know they are the truth, God's Algorithm – it's like being revealed the Meaning of Life, but you don't understand one word."[22]
Originally, an endgame tablebase was called an "endgame data base" or "endgame database". This name appeared in both EG and the ICCA Journal starting in the 1970s, and is sometimes used today. According to Haworth, the ICCA Journal first used the word "tablebase" in connection with chess endgames in 1995.[55] According to that source, a tablebase contains a complete set of information, but a database might lack some information.
Haworth prefers the term "Endgame Table", and has used it in the articles he has authored.[56] Roycroft has used the term "oracle database" throughout his magazine, EG.[57] Nonetheless, the mainstream chess community has adopted "endgame tablebase" as the most common name.
John Nunn has written three books based on detailed analysis of endgame tablebases: