Board representation (chess)
From Wikipedia, the free encyclopedia
In computer chess, software developers must choose a data structure to represent chess positions. Several data structures exist, collectively known as board representations.[1] Chess engines often utilize more than one board representation at different times, for efficiency.
Contents |
[edit] Requirements
A full description of a chess position, i.e. the position "state", should contain the following elements:
- The location of each piece on the board
- Whose turn it is to move
- Status of the 50-move draw rule. The name of this is sometimes a bit confusing, as it is 50 moves by each player, and therefore 100 half-moves, or ply. For example, if the previous 80 half-moves passed without a capture or a pawn move, the fifty-move rule will kick in after another twenty half-moves.
- Whether either player is permanently disqualified to castle, both kingside and queenside.
- If an en passant capture is possible.
Board representation typically does not include the status of the threefold repetition draw rule. To determine this rule, a nearly complete history of the game needs to be maintained, and so, is generally tracked in separate data structures.
[edit] Types
[edit] Piece lists
Some of the very earliest chess programs were working with extremely limited amounts of memory, such that even the 64 memory locations required to represent a chess board was too much to spend. These early programs would instead maintain lists of the locations of the up to 16 black and white pieces. Piece lists are still used by many of today's programs in conjunction with a separate board representation structure to speed up access to the pieces. Instead of looping through 64 (or more) squares to find all of the pieces, piece lists give instant access to the pieces.
[edit] Array based
One of the simplest ways to represent a board is to create an 8x8 two-dimensional array (or, equivalently, a 64 element one-dimensional array). Each array element would identify what piece or pawn occupied the given square, or alternatively, if the square is empty. A common encoding is to consider 0 as empty, positive as white, and negative as black, e.g., white pawn +1, black pawn -1, white knight +2, black knight -2, white bishop +3, and so on.
A problem with this approach arises during move generation. Each move has to be checked to ensure it is on the board, significantly slowing down the process.
One solution is to use a 12x12 array instead, with the outer edges filled with, say, the value 99. During move generation, the operation to check for a piece on the destination square will also indicate whether the destination square is off the board.
Better memory usage can be achieved with a 10x12 array, which provides the same functionalities as a 12x12 one by overlapping the leftmost and rightmost edge files (which are marked as off-the board). Some chess engines use 16x16 arrays to improve the speed of the rank and file number conversion and allow some special coding tricks for attacks etc.
Machine code programmers have another alternative. Using 4 bits per square, an entire row can be represented in 32 bytes, and the board in 8 registers (with an additional one for remaining position information). By use of a jump table, adding the piece value to the program counter can go directly to the code to generate moves for this type of piece on this square. Although the program is longer than for a conventional move generation methods, no checks for the edge of the board are required, and no moves off the board are considered, increasing move generation speed.
[edit] 0x88 method
The 0x88 method takes advantage of the fact that a chessboard's 8x8 dimensions are an even power of two. The board uses an array of size 16x8 = 128, numbered 0 to 127. It is basically two boards next to each other, the actual board on the left. The binary layout of the rank and file of a legal board coordinate is [0 r r r 0 f f f]. When generating moves from the main board, one can check that a destination square is on the main board simply by ANDing the square number with hexadecimal 0x88 (binary [1 0 0 0 1 0 0 0]). A non-zero result indicates that the square is off the main board. In addition, the difference between two squares' coordinates uniquely determines whether those two squares are along the same row, column, or diagonal (a common query used for determining check).[2]
[edit] Bitboard
One commonly used board representation is the bitboard. A bitboard is a 64-bit sequence of bits (0 or 1), which indicates the absence or presence (false or true) of some state about each place on the board. A board position can then be represented using a series of bitboards. For example, a series of bitboards for each piece type or pawn, for each side, can represent the board position.
The advantage to this representation is the ability to use bit parallel operations upon the 64-bit entities instead of iteration to manipulate and derive information about the state of the board. This makes maximal use of the hardware available, especially as once exotic 64-bit processors become mainstream.
[edit] Huffman Encodings
Inspired by Huffman coding, chess positions can be represented with bit patterns in such a way that more common board elements (empty squares and pawns) are stored with less bits than less common board elements:
Empty square = 0 Pawn = 10c Bishop = 1100c Knight = 1101c Rook = 1110c Queen = 11110c King = 11111c Where c is a bit representing the color of the piece (1 = LIGHT, 0 = DARK). Additional bits are needed for: 50-move rule (7 bits) en-passant column (4 bits) color to move (1 bit) castling rights (4 bits).
Huffman encoding schemes allow a complete board state (starting position) to be represented in just 23 bytes.
Huffman encodings are fairly CPU intensive, in comparison to other board representations which seek to minimize required processor and memory cycles. However, the small size of the final representation makes this approach well suited to storage of long term knowledge, for instance in storing positions in an opening book, where minimizing the board representation's size is more important than minimizing CPU cycles. Huffman encoded boards are also sometimes used in transposition tables for shallow entries.
[edit] Forsyth-Edwards Notation (FEN)
FEN notation is a board representation that was popularized before the advent of computers and is a human-readable method of recording a chess game position in a single line of text. FEN is still used frequently by chess programs for saving board positions to external storage, whenever a human may view that storage.
[edit] Examples
Here is the FEN for the starting position:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
Here is the FEN after the move 1. e4:
rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1
And then after 1. ... c5:
rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR w KQkq c6 0 2