Talk:A* search algorithm

Article title

Someone moved this from A Star Search algorithm, but it should be located at A Star search algorithm since "Star" is part of the title. It is usually written A*, but pronounced like the title of the article. It is "A Star," not "A star" like "A square" or "A circle." It is a specific algorithm—using lowercase makes it sound like it describes a generic sort of algorithm. Anyone else have any input on this? -Frecklefoot

I think A-Star search algorithm is about right. Better would be A* search algorithm, but apparently that's not an option?

Bart Massey

A* search algorithm as a topic works fine, as you can see. :) --ZeroOne 21:55, 17 Nov 2004 (UTC)

Bogus link

I removed the link to B* search algorithm. I determined it was bogus--it failed the Google test. —Frecklefoot 15:32, 12 Dec 2003 (UTC)

Well, it's referenced by the article on Maven (Scrabble), which supposedly uses that search algorithm. An article on how that algorithm works would be enlightening. RSpeer 08:06, Dec 25, 2004 (UTC)
B* is mentioned in (the classic) Artificial_Intelligence:_A_Modern_Approach, in the game-playing chapter, if I recall correctly as an alternative to minimax with alpha-beta pruning.
The fact is, you shouldn't use the Google test on AI terms. When you're just doing AI-related work, you don't tend to classify exactly what algorithms or ideas you're using in standardized terminology (unless it's a popular buzzword and you want funding); the only place that much precision is used is in textbooks, which you can't Google. RSpeer 17:53, Dec 26, 2004 (UTC)

Admissible heuristics

A bit of pervasive misinformation about A* search is that, as long as the heuristic never overestimates the distance to the goal, then the search is admissible. This is wrong.

As a counterexample: make a graph that you want to search, with a start and goal node, and give it an inadmissible heuristic, one that overestimates the distance somewhere so that A* chooses the wrong path. For the purpose of this counterexample, this heuristic shouldn't have any values over 1000000.

Now extend this graph by adding a new goal node, connecting it to the old goal node with an edge with a cost of 1000000. Suddenly the heuristic isn't an overestimate anywhere; if you're anywhere but the goal, it will cost at least 1000000 to get to the goal, and the heuristic is lower than that everywhere. So by the commonly-stated requirement for A* search, it should find the right path - but it doesn't. It finds the same path it found before, plus the new edge.

The requirement is that the heuristic can't overestimate the distance to anywhere; that is, if there's a path from A to B, then h(A) - h(B) is not greater than the shortest-path distance from A to B.
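To state that in code: a rough sketch, assuming the graph is given as a dictionary mapping each node to a list of (neighbour, edge cost) pairs and the heuristic as a dictionary of estimates (the names are invented for illustration):

def is_consistent(graph, h):
    # graph: dict mapping node -> list of (neighbour, edge_cost) pairs
    # h: dict mapping node -> heuristic estimate
    # Checking every edge is enough: summing h(a) <= cost(a, b) + h(b)
    # along a path gives h(A) - h(B) <= shortest-path distance from A to B.
    for a, edges in graph.items():
        for b, cost in edges:
            if h[a] > cost + h[b]:
                return False
    return True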

Course web pages, teaching assistants, and even textbooks leave this part out and make a provably false statement. Should I expand on this more in the article? RSpeer 08:06, Dec 25, 2004 (UTC)

Hmm. I've looked up a little more, and found a possible reason why the second condition is often left out; you can use a less efficient version of A* that always finds the correct path without that condition. However, the usually-stated algorithm requires both conditions. These notes describe both algorithms, and call the second condition "monotonic". RSpeer 18:04, Dec 26, 2004 (UTC)

Blah. I screwed up. The page was describing that less-efficient version all along. I moved the thing about monotonic heuristics to its own section. Sorry. RSpeer 19:23, Dec 27, 2004 (UTC)

Intuition on why A* is admissible and optimal

I added this section because it reflects exactly how I initially recognized that the search algorithm proposed by Nils Nilsson and Bert Raphael was "the best we'll ever find".

See the original A* paper (which, incidentally, we had a hard time publishing; leading journals of the day were pleased to reject it as trivial or of no interest):

Hart, P. E., N. J. Nilsson, and B. Raphael, "A Formal Basis for the Heuristic Determination of Minimum Cost Paths in Graphs," IEEE Trans. on Systems Science and Cybernetics, Vol. SSC-4, No. 2, pp 100-107, (July 1968)

It seems to me that this intuitive view is still of interest; I hope that it's more than a purely historical curiosity.

Difficult to extend heuristic

The intuition section of the article claims that A* does not take into account non-crossing and spatially close nodes, and that such considerations would be difficult or expensive to encode into the heuristic. Why so?! In my latest version, distance to the neighbouring node is one aspect of the heuristic, and spatially close nodes will therefore tend to be prioritised. It was both trivial and cheap to implement. However, the question is whether this is preferable, especially in non-symmetric spaces, where one long jump can be cheaper than several short ones. Mikademus 07:56, 16 September 2006 (UTC)

Missing details

1. Description of path reconstruction from the closed priority queue: The article claims that the path can be reconstructed from the closed priority queue, without storing the path so far at each node. The article directs the reader to "see below" for a description of the advertised technique, but that description is nowhere to be found.

2. Check candidate nodes to see if they're already in Closed AND Open: The article mentions that you must check whether a candidate node is already in the closed priority queue. I have read other descriptions, such as Amit Patel's (linked at the bottom of the page), which state that you should also check whether the candidate node is already in the open priority queue. Experimentally I determined that the algorithm runs at unacceptable speeds unless I perform this additional check.

Since I am in no position of authority on A* I have not made any edits to the article. Hopefully someone else can address these two issues.
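For what it's worth, here is a rough Python sketch (not taken from the article or from Amit Patel's pages; the names are invented) showing how both points are commonly handled: parent pointers for reconstructing the path, and a comparison against the best recorded cost, which covers candidates sitting on either the open or the closed list:

import heapq

def a_star(start, goal, neighbors, h):
    # neighbors(node) yields (next_node, edge_cost) pairs
    open_heap = [(h(start), start)]
    g = {start: 0}          # best known cost from the start node
    parent = {start: None}  # point 1: parent pointers for path reconstruction
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:  # walk the parent pointers back to the start
                path.append(node)
                node = parent[node]
            return list(reversed(path))
        if node in closed:
            continue
        closed.add(node)
        for nxt, cost in neighbors(node):
            new_g = g[node] + cost
            # point 2: skip candidates already reached at equal or lower cost,
            # whether they currently sit on the open or the closed list
            if nxt in g and g[nxt] <= new_g:
                continue
            g[nxt] = new_g
            parent[nxt] = node
            heapq.heappush(open_heap, (new_g + h(nxt), nxt))
    return None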

Monotonicity

The "open" and "closed" bookkeeping is not necessary...

Would someone add an explanation of why this is the case, i.e. how an algorithm might be implemented for a monotonic heuristic? This document says that there is no need for a closed list, but the article implies that the open list isn't necessary either. It's not obvious how the backtracking would work. Thanks.   — Lee J Haywood 19:13, 28 August 2005 (UTC)

You don't really backtrack. Every time you expand a node, you have found the shortest path to that node. So, you just want to make sure you don't revisit nodes. I guess it would make sense to consider the mechanism that does that to be an "open list", even though there's no "closed list" to contrast it with. I'll rephrase that. RSpeer 03:20, August 29, 2005 (UTC)

Only one list of visited nodes needs to be maintained, so that nodes are not unnecessarily revisited.

Doesn't this wording seem to imply a closed list rather than an open list? Also, when I said 'backtracking' I was referring to the situation where a search hits a dead-end (where the remaining distance is low but there is no valid path) and the choice has to be made as to which open list entry to expand next... I could really do with an example of a valid, monotonic heuristic in action to understand what's going on. Thanks.   — Lee J Haywood 08:48, 29 August 2005 (UTC)
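To give a concrete example of a monotonic heuristic in action (a rough sketch, not from the article; the grid setting is assumed): on a 4-connected grid with unit step costs, the Manhattan distance to the goal is monotonic, since a single step changes the estimate by at most the step cost. A single visited set is then all the bookkeeping needed, and a dead end simply exhausts itself, after which the cheapest remaining open entry is expanded next:

import heapq

def manhattan(a, b):
    # monotonic on a 4-connected grid with unit step costs:
    # one step changes the estimate by at most the step cost (1)
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def grid_a_star(start, goal, passable):
    # passable(cell) -> True if the cell may be entered
    open_heap = [(manhattan(start, goal), 0, start)]
    visited = set()          # the only bookkeeping needed with a monotonic h
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell in visited:
            continue         # already expanded via a path that is no worse
        visited.add(cell)
        if cell == goal:
            return g         # cost of a shortest path
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if passable(nxt) and nxt not in visited:
                heapq.heappush(open_heap, (g + 1 + manhattan(nxt, goal), g + 1, nxt))
    return None

print(grid_a_star((0, 0), (3, 2), lambda c: 0 <= c[0] < 5 and 0 <= c[1] < 5))  # prints 5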

How does it work

Right now, the article doesn't explain how the algorithm works in detail.

Also, it'll be great to explain why it's called "A*". --Abdull 15:47, 27 November 2005 (UTC)

In response to this question about why it's called A*:
The notation is borrowed from the statistical literature. Statisticians use a hat (also called a circumflex) to indicate an estimate for a quantity, and often use a star to indicate an estimate that's optimal with respect to a stated criterion (like, say, a minimum variance criterion). When I (Peter E. Hart) was developing this algorithm and especially the theory behind it with my colleagues Nils Nilsson and Bertram Raphael, we adopted this traditional notation. We used hats to indicate estimates of heuristic functions, and went on to compare any other algorithm, call it A, with our provably-optimal (by the criterion of number of nodes expanded) A*. Hart 02:16, 7 March 2006 (UTC)

I agree. Also it would be nice to know who first developed it. --Kubiak 19:30, 29 November 2005 (UTC)


Hm, I hope the new article is a bit better in this regard :-) Regnaron 18:59, 15 December 2005 (UTC)

Rewriting of article

Hi, I just translated the article on the A* search algorithm that I wrote for the German Wikipedia (which was itself originally based on the old version of this article) into English and uploaded it here. But since my mother tongue is German, I am sure there are not just dozens of typos, but also tons of wordings that make native speakers want to run away screaming :-) So I would be very pleased if some of you could correct my choice of words :-) Regnaron 18:58, 15 December 2005 (UTC)

I think your translated article is very comprehensive, if a bit overly "textbooky" for Wikipedia. (Have you considered Wikibooks?) However, it's full of typos, grammatical atrocities, and just generally things that are wrong in English. It's very difficult to get any sense of what A* is or what it does simply by reading the article from start to finish, because nobody who starts reading will finish! So I'm reverting to the old English article — but with an encouraging word for any native English speaker who wants to try to translate the German article from scratch. (But please, less wordy!) --Quuxplusone 02:50, 16 December 2005 (UTC)
Hi! Hm, well, in fact I tried to make the article comprehensive through the big example section and the explanation of how it works, and I would still say this can help the reader grasp the idea of how an A* search works. (Nevertheless, all aspects of the algorithm were covered in the article.) But as I stated in my initial post here: I already assumed that there was (or now, was :-)) a lot of wording that would simply be wrong in English. (I just did not expect it to be that much *g*) Getting to the point: I believe that your comments and criticisms are correct, so I will not start yelling and hunting you down for reverting the article. Still, in case anyone wants to invest some time "translating" my old article: it will remain accessible via my Playground on my user page. Regnaron 08:32, 16 December 2005 (UTC)

Relation to uniform-cost search

"Uniform-cost search, aka Dijkstra's algorithm, is a special case of A*..." - even a quick glance at the linked articles shows that "Dijkstra's algorithm" is not the same thing as "uniform-cost search". That said, I'm not knowledgeable enough on this topic to feel comfortable rewriting the sentence. Rizome 20:07, 8 May 2006 (UTC)

Hi! Hm, if I have a look at what is written at Uniform-cost search (UCS), it fits the description of Dijkstra's algorithm quite well. If you just replace root with source, you get a description of Dijkstra's algorithm. I am not absolutely certain why there are two pages describing the same algorithm, but in its current state the UCS article does (in a very simplified way) describe Dijkstra's algorithm. (Always expand the globally cheapest node.) Regnaron 18:48, 22 May 2006 (UTC)

Ditto; as far as I can see, Dijkstra is the same thing as uniform-cost search. I also have trouble seeing how Dijkstra is a special case of A*, considering that A* came after Dijkstra published his algorithm; the fact that the latter can be seen as a derivative of the former doesn't mean that the former is a special case of A*.

Dijkstra's is a special case in the mathematical sense, this has nothing to do with historical derivation. Qwertyus 14:10, 26 May 2006 (UTC)
Well, use A* with the heuristic h(x) = 0 for all x (meaning no heuristic at all).
What you get is just Dijkstra's algorithm, so in this sense Dijkstra is a special case of A* (namely the one where the heuristic is 0 for all nodes). Regnaron 21:14, 26 May 2006 (UTC)
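A rough sketch of that point (invented names, not from the article): with h(x) = 0 the priority f = g + h collapses to g, so nodes are expanded in order of their distance from the start, which is exactly what Dijkstra's algorithm does.

import heapq

def a_star_cost(start, goal, neighbors, h=lambda node: 0):
    # with the default h(node) = 0 the priority f = g + h reduces to g,
    # so the expansion order is Dijkstra's: nearest-to-start first
    open_heap = [(h(start), 0, start)]
    settled = {}                       # node -> cost of the shortest path found
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node in settled:
            continue
        settled[node] = g
        if node == goal:
            return g
        for nxt, cost in neighbors(node):
            if nxt not in settled:
                heapq.heappush(open_heap, (g + cost + h(nxt), g + cost, nxt))
    return None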

Intuition about why A* is not computationally optimal

This section points out that A* may not be computationally optimal for non-planar graphs.

The proof of the optimality of A* relies on the fundamental assumption that the triangle inequality is satisfied for every 3-node subset of the graph being searched. A non-planar graph may violate this assumption.


Huw: Actually, the way I read this section, it states that A* is not computationally optimal for planar graphs because A* does not take advantage of certain features of planar graphs. I claim that this is nonsense. I claim that A* depends for its quality entirely on the heuristic being used. For example, if h(x) = h*(x) for all x, then A* will walk straight to the solution regardless of the type of graph being examined. Typically, for planar graphs, h(x) is defined to be the straight-line planar distance from x to the goal. This makes A* take advantage of precisely the planar features that the author claims it does not. This means that A* will be computationally optimal for these graphs.

Matt: The previous comments are correct. This section needs to be deleted, or at least retitled and heavily edited! The optimality of A* is a fundamental principle of search algorithms. Every A.I. textbook teaches it, and any debate should be left to researchers in the field. Explaining why it doesn't make sense to you should not be posted in the main article; bring it up in discussion and we can clear up any misunderstandings. People get confused because: a) they think 'optimal' means 'fast'; b) they forget that A*'s optimality proof assumes the heuristic is admissible and consistent.

Complexity

Polynomial... what polynomial? n^2 is much, much better than n^3.

I would doubt the correctness of that paragraph as a whole... First of all, I do not see any exponential running time in the algorithm. It has to visit each vertex in the graph at most once and consider each edge at most once. That is linear in the number of vertices and edges. You get some overhead from maintaining the priority queue, but still nothing exponential. Furthermore: if the graph is a list, you can have the best heuristic and still do no better than with no heuristic at all, since you have to walk the complete path to the goal node, e.g. s → a1 → a2 → … → an → g, where s is the start node and g the goal node. Here you may have a perfect heuristic, but the running time does not change at all, since you cannot skip any nodes. Regnaron 17:24, 14 July 2006 (UTC)
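To make the list example concrete, here is a throwaway sketch (not from the discussion) of a chain s = 0 → 1 → … → n = g with unit edge costs; A* expands every node on the chain whether it is given no heuristic or the exact remaining distance:

import heapq

def count_expansions(n, h):
    # chain 0 -> 1 -> ... -> n with unit edge costs; start at 0, goal is n
    open_heap = [(h(0), 0, 0)]   # (f, g, node)
    expanded = 0
    seen = set()
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node in seen:
            continue
        seen.add(node)
        expanded += 1
        if node == n:
            return expanded
        heapq.heappush(open_heap, (g + 1 + h(node + 1), g + 1, node + 1))
    return expanded

n = 1000
print(count_expansions(n, lambda v: 0))      # 1001 expansions with no heuristic
print(count_expansions(n, lambda v: n - v))  # 1001 expansions with the perfect heuristic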

Huw: I agree. This paragraph is simply not true. I'm not even sure what O(log h*(x)) means in this context, since x represents a vertex in a graph, and O(f(n)) should imply that f is the dominant term in the cost of an algorithm as n → ∞.

O(log h*(x)) is the asymptotic bound on the error of the heuristic relative to the perfect heuristic h*. I explained this and inserted a reference. Btw., visiting each edge and each node once is only possible for finite graphs. A* can be used on infinite graphs, or at least on graphs that do not fit in any computer's memory in an "explicit" representation. Qwertyus 00:44, 21 July 2006 (UTC)

The Pseudo-code

I'm not satisfied with the pseudo-code as it stands; in attempting to stay close to actual programming, it is in fact more confusing than helpful. I'd suggest something more talkative and closer to natural language, like my take below. I'd appreciate any opinions or alternative takes, because I think we do need another pseudo-code.

AStarSearch( start_node, goal_node ) 

  queue open_list
  queue closed_list

  push start_node on open_list

  while open_list not empty

    let node be RemoveLowestCostFrom( open_list )

    if node is goal_node 
      then return SUCCESS

    else

      push node on closed_list

      foreach neighbor that is adjacent_to( node )
        if neighbor is not in open_list
          then if neighbor is not in closed_list
            then
              EstimateCostFor( neighbor )
              push neighbor on open_list

return FAILURE

Mikademus 08:07, 11 August 2006 (UTC)

This version is clearer, but:
  • It returns success or failure, rather than a path. In fact, it doesn't remember paths, unless they're stored in nodes, but then the if neighbor is not in open_list is redundant.
  • The function EstimateCostFor should be called implicitly by the push function.
  • closed_list is not a queue, but a set.
  • The goal node need not be explicitly given. A* is used for other things than simple pathfinding (such as natural language parsing), where the goal is not even known beforehand.
My suggestion:
procedure AStar(start_node)

  var open_list   := empty queue
  var closed_list := empty set

  var initial_path := a partial path containing only start_node

  push initial_path on open_list

  while open_list is not empty

    let partial_path be RemoveItemWithLowestCost(open_list)

    if partial_path reaches a goal node
      return partial_path

    else if the last node of partial_path is not in closed_list
      add this node to closed_list

      foreach neighbor of the last node in partial_path
        var new_path := partial_path extended with neighbor
        push new_path on open_list

return FAILURE
Qwertyus 13:00, 11 August 2006 (UTC)
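For reference, the partial-path version above translates fairly directly into Python; a rough, unofficial sketch (the cost estimate is computed when a path is pushed, the goal is given as a predicate, and the names are invented):

import heapq
from itertools import count

def a_star_paths(start_node, is_goal, neighbors, h):
    # each open-list entry is a whole partial path;
    # its priority is the cost so far plus the heuristic of its last node
    tie = count()                      # tie-breaker so the heap never compares paths
    open_list = [(h(start_node), 0, next(tie), [start_node])]
    closed_set = set()
    while open_list:
        f, cost, _, partial_path = heapq.heappop(open_list)
        last = partial_path[-1]
        if is_goal(last):
            return partial_path
        if last in closed_set:
            continue
        closed_set.add(last)
        for neighbor, edge_cost in neighbors(last):
            new_cost = cost + edge_cost
            heapq.heappush(open_list, (new_cost + h(neighbor), new_cost,
                                       next(tie), partial_path + [neighbor]))
    return None  # FAILURE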

I agree with the points you provided above. I based the style of the pseudocode on that in the "AI for Game Developers" book by Bourg & Seemann, in which the conceptual outline does not complicate the code with building a path, only showing how to find the goal node. However, it may be even more confusing to omit the creation of the path itself, so here is my follow-up suggestion, with some improvements(?):

  • Conceptually easier to follow path generation
  • Providing a goal node for pedagogical reasons
  • The bizarre pleasure of beholding an unholy pseudo-language which is consistent and looks like a mixture of Pascal, Python and Visual Basic
function AStar( Node start_node, Node end_node) returns List of Node

  type Visited_Node := record of 
    node := Node
    parent := Node
  end record

  var open_list := empty Queue of Visited_Node
  var closed_list := empty Set of Visited_Node

  var node_info := Visited_Node( start_node, empty );

  push( what:node_info, on:open_list )

  while open_list is not empty

    var node_info := remove_first( from:open_list )
    push( what:node_info, on:closed_list )

    // The goal node was reached. Trace the path using
    // the saved parent information and return path
    if node_info.node equals end_node
      then return trace_path( from:node_info, in:closed_list )

    else
      foreach neighbor of node_info.node

        if neighbor is not in open_list
          and if neighbor is not in closed_list
            then
              let node_cost := EstimateCostFor( neighbor )
              var new_node := Visited_Node( neighbor, node_info.node )
              push( what:new_node, cost:node_cost, on:open_list )

// Pathfinding failed. Return an empty
// list to indicate this. 
return empty

Mikademus 15:41, 11 August 2006 (UTC)

I think this is a bit long and specific, especially with regard to the Visited_Node record type. IMHO, the pseudocode is only there for illustration and reference, and the algorithm should be explained in the text.
Your pseudolanguage is very, ehm, interesting, indeed. It's quite consistent, but I don't regard that as a major quality in pseudocode: the whole point of it is being able to mix programming and natural language.
Also, I don't understand why you'd want to check whether neighbors are in open_list. Qwertyus 17:44, 11 August 2006 (UTC)
Hi! Well, here is a third suggestion :-) (copied over from the German Wikipedia with a quick-and-dirty translation of the comments)
/*
 * algorithm reconstructing a shortest path
 * ----------------------------------------
 * n: goal node
 * p: computed shortest path (initially empty)
 */
reconstruct_shortest_path (n, p)
{
   // walk predecessors until the start node (which has no predecessor) is reached
   while (π[n] != NIL)
   {
      // add node to path
      push(n, p);
      // continue with the predecessor
      n := π[n];
   }
   // finally add the start node itself
   push(n, p);
   // return the path found (in goal-to-start order)
   return p;
}
/*
 * Initializing the graph
 * ----------------------
 * G: graph to be initialized
 * s: start node
 */
initialize(G, s)
{
   // set the distance to all nodes to infinity
   // and clear all predecessor pointers
   forall v do {
      d[v] := ∞;
      f[v] := ∞;
      π[v] := NIL;
   }
   // set the distance to the start node to zero; any finite f
   // works here, it only ensures s is extracted from the queue first
   d[s] := 0;
   f[s] := 0;
}
/*
 * relaxing an edge
 * ----------------
 * u: current node being explored
 * v: currently considered successor of u
 * w: weight function for computing the weight of an edge
 * h: heuristic, giving f[v] = d[v] + h[v]
 */
relax(u, v, w, h)
{
   // check if going through u results in a shorter path to v
   if f[v] > d[u] + w(u,v) + h[v] {
      // the new path is shorter:
      // update the distance from the start and the
      // estimated total path length through v
      d[v] := d[u] + w(u,v);
      f[v] := d[v] + h[v];
      // update the predecessor pointer for the shortest path
      π[v] := u;
   }
   else
      ; // no change, a path at least as short is already known
}
/*
 * A* algorithm
 * ------------
 * G: graph to run the algorithm on
 * s: start node
 * g: goal node
 * w: weight function for computing the weight of an edge
 * h: heuristic estimate of the remaining distance to the goal
 * Q: priority queue over the nodes, keyed by f[v] = d[v] + h[v]
 */
A-Star (s, g, w, h, G) {
      initialize(G, s);
      Q := V[G]; // add all nodes to the priority queue
      while not isEmpty(Q) {
         // extract the node with the smallest f value
         u := pop(Q);
         if (u == g) then
            return reconstruct_shortest_path(g, empty path);
         else {
            // examine all neighbours of the current node u
            forall v := successors(u) do {
               // relax the edge between u and its successor v
               // (an improvement implies a decrease-key on Q)
               relax(u, v, w, h);
            }
         }
      }
      // no path found
      return fail;
}

So, why this algorithm? Well, first of all I like building separate functions for relax and initialize, because it makes the algorithm itself cleaner. How the graph is initialized does not have anything to do with how A* works. Furthermore, this makes it *very* similar to Dijkstra, in that you just have to adjust the relax function to cope with the heuristic. It also spares you the question of "should the path generation be in the algorithm or not": it is not directly in the algorithm, but one can find it hidden in the relax function's bookkeeping. The last thing is the part about open/closed lists. If you have a monotone heuristic (which should not be too hard to find in many domains), you do not need to do that extra bit of bookkeeping. So leaving out this part and just assuming a monotone heuristic makes the algorithm smaller. Furthermore, I am not sure the pseudocode should be too informal; the code is still code and should be explained in natural language that accompanies it. I mean: regardless of how informally you write the code, you still have to explain it. So why not one brief (quite formal) overview plus an explanation in natural-language English? Regnaron 19:41, 12 August 2006 (UTC)

IMHO, this version:
  • describes everything it does twice, once in code, once in comments, entirely defeating the purpose of pseudocode;
  • doesn't translate easily to applicative programming languages, due to the use of destructive assignment;
  • can only be understood properly if one knows Dijkstra, which is unnecessary for explaining A*.
Qwertyus 20:29, 12 August 2006 (UTC)


Hi!
Hm, OK, if you understand the code, it does describe things twice, right. But unless there is some explanation in the text, I think I would not instantly know what, for example, the line if f[v] > d[u] + w(u,v) + h[v] does. If there is some prose explaining the code, you are right that one can just get rid of the comments. :-) The second point is translating into applicative programming languages; here I do not really understand what you mean. For example: Q is a priority queue, and those may have a pop operation that returns the element on top and removes it from the queue, and this element is then assigned to the variable u. So I think you can quite easily turn this code into Java or whatever, since it is already fairly formal. Or did I get you wrong? Third point: I rather think it's the other way round: Dijkstra and A* are (if you let Dijkstra stop as soon as it finds a goal) equal. If you explain one of them, you have also explained the other. So if you start from scratch (which an article about an algorithm should, IMHO), explaining Dijkstra or explaining A* yields more or less the same article, simply because that version of Dijkstra (the one that stops at the first goal) is nothing more than A* with the heuristic h[n] = 0 for all nodes. Regnaron 08:32, 13 August 2006 (UTC)


Hmm, I'm actually with both of you here, since pseudo-code and implementation code serve different purposes: the pseudo-code should provide a bare-bones conceptual overview of the algorithm with all complications peeled away, while the "real" code of course hands out a working implementation. Therefore, I'd suggest that we, similar to the Bellman-Ford article, show pseudo-code early in the article and insert an "implementation" section with real code, e.g. Regnaron's version above, late in the article. Mikademus 07:41, 13 August 2006 (UTC)

But isn't this idea of peeling away the complications already realised in that code? To me it reads as follows:
  • Initialize the graph. How? -> There is a magic function that does that. What is "initializing a graph"? -> See the prose.
  • Add all nodes to the priority queue (OK, here the notation might indeed be a bit awkward). Once more: how? -> Just do it.
  • Take the first node off the queue.
  • Check whether the goal has been found. If yes, return the path (again: how? There is a function that does that.)
  • Look at all neighbours of the current node and relax them. What is relax? -> It has been explained in the text.
  • If there are no nodes left to expand and the goal has not been found => there is no path to the goal.
So here the "hard" parts live in separate functions that in principle do not need to be understood, or can be explained at a "high level" in prose.
Or do you aim for something really close to natural language, like the above, with as little code as possible? Why not then simply a list of natural-language items that describe the algorithm in English? Like:
  • Initialize the graph
    • Set the distances of all nodes to infinity
    • Set all predecessor pointers to NIL
    • Set the distance of the start node to zero
  • Insert all vertices of the graph into the priority queue, keyed by their distances.
...

Regnaron 08:30, 13 August 2006 (UTC)

The code is somewhat simplified, but as an example, take the pseudocode from the "AI for Game Developers" book (Bourg, David & Seemann, Glenn; 2004; O'Reilly Media Inc., Sebastopol):

add the starting node to the open list
while the open list is not empty
{
  current node = node from the open list with the lowest cost
  if current node = goal node then
    path complete
  else
    move current node to the closed list
    examine each node adjacent to the current node
    for each adjacent node
      if it isn't on the open list
        and isn't on the closed list
          and it isn't an obstacle
            move it to the open list and calculate cost
}

(p. 129) It traces out the pathfinding algorithm only, avoiding any "superfluous" details (like storing the cost or constructing the return path). The bad thing is of course that it isn't directly usable, but the advantage of not immediately jumping into the gritty implementation details is that it gives the reader the conceptual overview needed to avoid being overwhelmed and to gain a structural understanding into which to fit the discussion and details. Admittedly, my two suggestions above fail at this, since they lean too much towards "functional" implementations. Mikademus 09:20, 13 August 2006 (UTC)