Talk:Prim's algorithm

From Wikipedia, the free encyclopedia

Contents

[edit] C++ code for Prim's using adjacency matrix A

A[i][j] is dstance from node i to node j. Sentinels NONE and INF are used to avoid complex logic. The code is written in "computer olympiad style", using static allocation over STL containers or malloc'd memory. Jettlogic 15:35, 11 October 2006 (UTC)

 #include <algorithm>
 using namespace std;
 #define MAXN 5
 const int NONE = MAXN;
 const int INF = 1000;
 void mst_prim(int N, int A[MAXN][MAXN], int parent[MAXN+1]) {
     bool intree[MAXN+1];
     int distance[MAXN+1];
     fill(distance, distance+MAXN+1, INF);
     fill(parent, parent+MAXN+1, NONE);
     fill(intree, intree+MAXN+1, false);
     int u = 0;
     while(u != NONE) {
         intree[u] = true;
         // next u is closest not-in-tree vertex
         int nu = NONE;
         for(int v = 0; v < N; v++)
             if(!intree[v]) {
                 if(A[u][v] < distance[v]) {
                     distance[v] = A[u][v];
                     parent[v] = u;
                 }
                 if(distance[v] < distance[nu])
                     nu = v;
             }
         u = nu;
     }
 }


[edit] Colored lines in graph

Are there red and green lines on this graph? I am colourblind and cannot tell. A more apropriate choice of colours would be red and blue.

  • JA: They all look black on my browser. I suggest solid and dashed. Jon Awbrey 23:10, 15 February 2006 (UTC)
    • I agree, or thick and thin lines would work too. They should also be SVGs. Deco 23:19, 15 February 2006 (UTC)
      • Hi, I made those graphs; sorry about the colours, I did not realise it could be problematic. I will try some other styles (solid and dashed sounds good) and add them here when they are done. Shen 14:54, 16 February 2006 (UTC)
        • Hi, thanks very much for taking the time to make the graphs and considering my suggested revisions

Ok, I made some new graphs:

The current image - no change
The current image - no change
Thicker and thinner lines
Thicker and thinner lines
Dotted and dashed lines
Dotted and dashed lines
Opacity changes
Opacity changes
Which type would work best? Shen 16:44, 17 February 2006 (UTC)
I made some new images using thinner and thicker lines, it should be clearer now. Shen 22:06, 19 February 2006 (UTC)
I'd urge the use of colours vs. different sorts of lines here, since "graph colouring" is a very common term in the literature of computer science, and is a de facto standard way of denoting groups of edges or nodes, in discussion even more than pictorially. Explaining that the colouring is represented by dashes and whatnot would be convoluted. However using colours accessible to the colourblind is always a good idea. Also Shen, I don't think the current choice of colours is a good one (though I am not colour blind and therefore cannot tell if they are differentiable). See this article. DaveWF 07:37, 14 February 2007 (UTC)

[edit] Suggestion of a better explanation

I propose to replace the item list in the beginning of the page with this:

  1. Choose a vertex of the graph G as initial tree E
  2. Create a set of all edges in G connecting a vertice of E to a vertice outside of E
  3. Out of this set, choose an edge with the lowest weight and add it with the respective vertice to E
  4. Repeat 2-3 until all vertices of G are in E

E is the minimum spanning tree.


I agree with this one. The step "remove from the set an edge with minimum weight that connects a vertex in the tree with a vertex not in the tree" is imho not O(log E). --Pukeye 09:04, 28 August 2006 (UTC)

It is if you use a heap.

[edit] Don't Forget!

I agree that a better pseudo algorithm is needed here I also think that you NEED to mention that a edge should only be used if it results in a cycle, which is obvious to some, but really should be spelt out.

Don't you mean if it *doesn't* result in a cycle? - Solberg 06:57, 7 July 2006 (UTC)Solberg
He must've. You only add an edge if it's of minimum cost and doesn't cause a cycle. DaveWF 07:42, 14 February 2007 (UTC)

[edit] Something isn't right

From Cormen et all, the psuedocode for this Algorithm is: (Used only for discussion, do not use in article)

MST-PRIM(G, w, r)
01  for each u ∈ V [G]
02    do key[u] ← ∞
03      π[u] ← NIL
04  key[r] ← 0
05  Q ← V [G]
06  while Q ≠ Ø
07    do u ← EXTRACT-MIN(Q)
08      for each v ∈ Adj[u]
09        do if v ∈ Q and w(u, v) < key[v]
10          then π[v] ← u
11            key[v] ← w(u, v)

From this it appears that the explanation above (and in the article) is wrong, also it appears to me that the graphs are wrong.

The important difference: For each iteration of the while loop (lines 6-11), the minimum accessible vertex (and not edge) is used.

For this vertex u, each adjacent vertex v which hasn't been explored is considered and if the 'cut' crossing to it from u is least found so far, u is set as parent to v and v's key is updated.

The algorithm's progress should be as follows (when starting from the article's graph):

When starting from a (in the graphs), both b and d should have been connected to a at the first iteration. The next iteration would have explored d as it is the vertex with the least key that is still in the queue.

From d, the exploration would have considered b but would not have connected them in the MST, because b's weight from d is larger than from a. Also e and f would have been connected initially from d, but later e might have been updated to connect from b when it is explored. --Opium 12:01, 5 September 2006 (UTC) Updated again Opium 08:40, 6 September 2006 (UTC)

  • I don't think that's right. From my understanding, Prim's works like this:
- Let T the subtree of G consisting solely of the starting node
- While there are nodes in G that are not in T:
  - Find the node in G but not in T which has the least edge connecting to a node in T
  - Add that node and the corresponding least edge to T
- T is then the minimum spanning tree


[edit] Corrections

I agree with the last poster directly above this ("I don't think that's right...").. I made some corrections to the algorithm description, updated the runtime costs section, and tied the pseudocode to the initial algorithm description. The approach presented is a reworking of the ideas presented in n the Levitin 'Design and Analysis of Algorithms' text. To me the pictorial description of the process is fine. Let me know what you think.. maybe we can remove the 'contradictions' tag? --Turketwh 05:06, 17 November 2006 (UTC)

No, I believe that is too simple. Opium's interpretation of EXTRACT-MIN(Q) seems to be incorrect: the function chooses the smallest edge between two vertices. To clarify things a little, this is how I learned Prim's algorithm:
  • Set the root of tree A as a random node
  • Array S of all edges
  • Array Q of 'unused vertices'
  • While Q is not empty:
    • Remove the smallest edge from S that connects a vertex a in A to a vertex q in Q
    • Remove q from Q
    • Add q to A, under vertex a
This is how I interpreted the section Prim's algorithm in Introduction to Algorithms (chapter 23, section 2, ISBN 0-262-53196-8). I suggest somebody that can explain this more intuitively edit the article accordingly. --Stimpy 16:08, 17 December 2006 (UTC)

[edit] pseudocode cleanup

I cleaned up the code a lot, all of the "u"s and "v"s and "key"s are very confusing and difficult to read. I hope my changes are acceptable. --gatoatigrado 22:27, 16 January 2007 (UTC)

Sorry, there was a very important mistake that I made. The average time complexity for updating the adjacent vertices, E/V, can be used. --gatoatigrado

[edit] edit to time complexity?

I don't know about the fibonnaci heap (and I honestly haven't implemented the adjacency matrix, but an implementation to achieve the desired time complexity seems relatively easy), so please edit that and then perhaps the description of the time complexity can be changed.

Minimum edge weight data structure Time complexity (total) TC removal of 1 vertex TC edge update (also for each vertex)
adjacency matrix, searching V^2 V E/V
binary heap (as in pseudocode below) and adjacency list O((V + E) log(V)) = E log(V) log(V) E/V * log(V)
Fibonacci heap and adjacency list E + V log(V) log(V) E/V

All of the algorithms have do |V|-1 iterations. The adjacency matrix implementation takes O(V) time to find the minimum, whereas the other implementations including the fibonnaci heap? take O(log(V)) time. As described above, all of vertices adjacent to the last vertex added are updated. There are, on average, E/V adjacent vertices. The adjacency matrix and fibonnaci heap ? take constant time to update the minimum distance, whereas the binary heap takes log(V) time. The Fibonacci heap is significantly faster when the graph is dense enough that E is Ω(Vlog V). --gatoatigrado

It's sort of correct, sort of not. First, it would be clearer if you listed the time to update a single edge, because it's possible that some vertices have many more than E/V edges to update and that others have many fewer. Yes, it averages out, but if you think the table needs to be expanded to describe the time per operation then you might as well describe the time per operation and not the time per aggregated average of some sequences of operations. Second, the Fibonacci heap does take O(log V) to remove the minimum vertex (that, rather than finding the minimum, is the slow part for both it and the binary heap) and constant time to do an edge update. But both of those time bounds are amortized, meaning again that it's true only when averaged over a sequence of operations. Individual remove-min or edge updates can take much longer.
If you're going to update this, I think it would make sense to include also a k-ary heap. There seems to be no WP article on it but it's just like a binary heap with a higher branching factor. By using a branching factor of k, one can make the edge update time be O(log(V)/log(k)) at the expense of making the remove-min time go up to O(k log(V)/log(k)) — choosing k=E/V gives an algorithm with total time O(E log(V)/log(E/V)). It's not as good as Fibonacci theoretically but much simpler in practice. —David Eppstein 03:18, 20 February 2007 (UTC)

[edit] Isn't there anything wrong in the example ?

Please be certain about this sentence : "vertex D has been arbitrarily chosen as a starting point." , i think the starting point is A not D, if i am right , then you have to change the description of the second step in addition to the first step : i suggest the following :


This is our original weighted graph. This is not a tree because the definition of a tree requires that there are no circuits and this diagram contains circuits. A more correct name for this diagram would be a graph or a network. The numbers near the arcs indicate their weight. None of the arcs are highlighted, and vertex A has been arbitrarily chosen as a starting point.


The second chosen vertex is the vertex nearest to A: D is 5 away, B is 7. Of these, 5 is the smallest, so we highlight the vertex D and the arc DA. --Tameralkuly 18:10, 26 January 2007 (UTC)