Talk:Alpha-beta pruning

From Wikipedia, the free encyclopedia

Contents

[edit] Query about AB pruning description

In the description for this article, this appears: "If the move ordering for the search is optimal (meaning the best moves always searched first)"

As far as I can tell, the optimal search order for AB pruning is worst-first. The algorithm is looking to eliminate a subtree as early as possible, and the way to do this is to find a move in it that's worse than the current Alpha value; worst-first gives the best chance of doing this. Am I missing something? I'll update the description if I don't hear otherwise. --=== Jez === 11:48, 28 May 2007 (UTC)

No, it's best-first. Though it's best-first for the player whose move is under consideration *at that level of the tree*, which obviously alternates each move. The way it works is this: when trying to decide on a move for player A, consider the first move (ideally this is the best move, but not always -- if the move ordering was perfect there wouldn't be a need for the search). Arrive at a score for the first move. Then go the second move. The idea is to determine as quickly as possible that this move is worse for A than the first move. The quickest way to do this is to consider B's best response to that move. Now consider trying it in the reverse order. Suppose the first move tried is worse than the second. Then, when searching the second, we never encounter a move by B that proves it worse than the first move, so we have to consider all of B's possible responses instead of just one. HTH, let me know if it doesn't make sense. Evand 23:36, 28 May 2007 (UTC)
Ahh, I understand now. You're calling it B's best move, I was thinking of it as A's worst scenario, as it's usually A that's doing the search. At least my diagram is still correct. I think I'm right in thinking that AB pruning can only happen on a minimizer's move? --=== Jez === 10:15, 29 May 2007 (UTC)
When it extends more than two plies deep it gets a little harder to follow :) For example, when A is considering the first move, there are no established bounds, so the algorithm ends up the same as if B was searching for their best possible move from a position one ply deeper. (This is part of why it makes sense to consider the algorithm as symmetric, with the player doing the searching switching each ply.) The effect is actually the same it the alpha-beta window isn't tight enough to cause a cutoff on later moves A tries -- that is, if those moves are actually better than the first move, or if B hasn't yet found a good enough response to prove they aren't. BTW, for your diagram, it might make it clearer what's happening if some of the moves that got cut off were better than the alternatives that were searched. Evand 20:02, 29 May 2007 (UTC)

[edit] Incorrect

The article is incorrect. Alpha beta pruning pertains to TWO types of pruning whereof the author only describes one.

I think all search strategies except minimax and alpha beta should be put in a single page - Arvindn

I respectfully disagree - search is an interesting problem, and there's more than enough info on each algorithm for a page each. Besides, what would the page be called? "All search algorithms (other than minimax and alpha beta)"? Surely A* at least deserves its own page? Sorry. GeorgeBills 15:48, 14 Jun 2005 (UTC)

I put ordering heuristics here because they derive their power only from their interaction with alpha-beta search -- they have no advantage otherwise. The Anome


The article claims: The optimisation typically reduces the effective branching factor by two compared to simple Minimax, or equivalently doubles the number of ply that can be searched in a given time. Since the number of nodes to be searched (i.e. time required) increases exponentially with each ply, I don't think this is correct. Here's an example, with math... take a tree which had an effective branching factor of 6 before applying alpha-beta pruning... Assuming I had time to search 4 plys deep previous to pruning, I would have had time for 6^1+6^2+6^3+6^4=1554 nodes. If alpha-beta reduces the branching factor to 3 now, the article's claim is that I should have time to search 8 plys deep. However, that would be 3^1+3^2+3^3+...+3^8=9840 nodes. Clearly, reducing the effective branching factor by two does not buy one the time to search twice as many plies deep. Have I made an error in this reasoning? Unless I hear otherwise, I'll update the article. --Ds13 22:31, 19 January 2006 (UTC)

Hmm. Using the formula for the sum of a geometric progression with m=1 and taking x and y as the two bases:
\frac{x-x^{n+1}}{1-x} = \frac{y-y^{m+1}}{1-y}
So what is x in terms of y, if x < y? What speedup can you recieve i.e. what is n in terms of m? r3m0t talk 07:39, 20 January 2006 (UTC)
Yes, you are correct that the current description is in error. Alpha-beta pruning reduces the number of searches required under the minimax algorithm by the square-root of what the total number would otherwise be. In effect, this allows exactly 1-ply deeper (using chess terminology) to be completed, without error or loss of important information, within the same time. --AceVentura
Actually, it allows you to search twice as many plies (assuming perfect ordering, of course) -- w ^ d = sqrt(w ^ (2 * d)). Evand 06:30, 22 January 2006 (UTC)
I think the implicit assumptions are that only end nodes are evaluated and that evaluation takes much longer than move generation. Stephen B Streater 14:15, 23 June 2006 (UTC)
With an optimal ordering of the moves the complexity can indeed be reduced to O(bd / 2) - but I do find it noteworthy that optimal ordering is impossible (otherwise this order could be used for decision making without the need for any minimaxing...). AFAIK Moore and Knuth first examined alpha-beta pruning in 1975, concluding an average asymptotic complexity of O(b3d / 4) for small b, converging to O((b / logb)d) for b > 1000. Maebert 2:22, 7 December 2006 (UTC)

The pseudo-code has been changed back-and-forth several times (see the *** below), so perhaps it's better to add some discussion here rather than to add to the article itself.

 evaluate (node, alpha, beta)
   if node is a leaf
       return the heuristic value of node
   if node is a minimizing node
       for each child of node
           beta = min (beta, evaluate (child, alpha, beta))
           if beta <= alpha
               return *** alpha or beta? ***
       return beta
   if node is a maximizing node
       for each child of node
           alpha = max (alpha, evaluate (child, alpha, beta))
           if beta <= alpha
               return *** alpha or beta? ***
       return alpha

The pseudocode is correct, regardless of whether alpha or beta is returned at the cut-offs. In either case, the parent node knows (implicitly) that a cut-off occurred, and that the cut-off node is equal to or worse than the current best child node, and that therefore it must retain the current best child node. —Unknown

Wait... how does the parent "implicitly" know that a cut-off occured? It could be returning a cut-off, or it could be returning an actual known value. I agree that it does not matter whether alpha or beta are returned in the cut-off case, but I think it is important to either a) return an additional property which says "equal to or worse" (as opposed to just equal), OR b) make it clear that the parent must favour earlier results to later ones (that is, if one node returns "2" and another node also returns "2", this second return "implicitly" means "2 or worse", while the first one actually means "2", so it must favour the first one). —EatMyShortz 17:27, 21 November 2006 (UTC)
I agree, this pseudo-code seems to be correct. I've been able to implement it in an unbeatable tic-tac-toe player. However, I've not been able to verify the current condensed pseudo-code in the actual article. Ceran 01:38, 13 January 2007 (UTC)
The difference between the two examples only matters when you have modified to algorithm to return additional information (like a move index) along with the score. However, this is almost always unnecessary. A typical implementation will use a simple numeric alpha beta search, where the search function only returns numbers. The top level function will call this search on each possible move from the current state, remember the highest score it has seen and apply the associated move index. So, in the actual low level search function, it does not matter whether you return alpha or beta as either one will be ignored.

[edit] Pseudo code is broken

The pseudo code is *WRONG* in the same way as negamax (which is identical, clearly something is wrong): the heurestic evaluation must be negated whenever player 2 is to play. Also, if the root call was for player 2, the final result must be negated again. —The preceding unsigned comment was added by 208.65.183.75 (talk) 07:39, 4 March 2007 (UTC).

Not true, although the description should be clearer. The heuristic is always from the perspective of the player executing the search, not the current player.

The code in the Talk page appears to be an implementation of MiniMax with Alpha Beta pruning. The code in the actual Wiki page appears to be an implementation of NegaMax. The difference is that MiniMax alternates between taking minimums and maximums, while NegaMax negates Alpha and Beta and the returned result so as to keep using maximums. It's simpler but quite possibly harder to understand. -- wadetb at gmail dot com

[edit] Improvement to image

The tree image at the top is quite helpful (especially compared with earlier versions), but would it be possible to show the values of Alpha and Beta somewhere? In the MiniMax page there is a description to go along with the image that describes the processing at each step. Something similar would help quite a lot on the Alpha-beta page.

I agree. There should be something like an algorithm walkthrough.

[edit] pseudocode - added from negamax

it was quite beutyful and short ;) ---- ca1 84.16.123.194 (talk) 16:05, 3 April 2008 (UTC)