Talk:Dynamic programming
Example
I (FvdP 21:51 Oct 22, 2002 (UTC)) had begun an example but do not feel the motivation to finish it cleanly right now (and perhaps not in the immediate future either). Here it is if anyone wants to complete it:
Suppose that we have to compute the product of n matrices A1.A2. ... .An, and we want to minimize the total number of operations by executing the matrix products in the best possible order.
The product of two matrices of size i × j and j × k
........
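In case it helps whoever finishes this: a minimal sketch of the classic recurrence in Haskell (my own, so treat it as a draft, not the article's code). It assumes dims holds the n+1 dimensions, so that Ai is dims!!(i-1) by dims!!i, and uses the fact that multiplying an i × j matrix by a j × k one costs i·j·k scalar multiplications:

 import Data.Array

 -- table ! (i, j) is the minimum number of scalar multiplications
 -- needed to compute Ai . ... . Aj. The array is lazy, so entries
 -- are filled in whatever order the dependencies demand.
 matrixChainCost :: [Int] -> Int
 matrixChainCost dims = table ! (1, n)
   where
     n = length dims - 1
     d i = dims !! i
     table = array ((1, 1), (n, n))
               [((i, j), cost i j) | i <- [1 .. n], j <- [i .. n]]
     cost i j
       | i == j    = 0
       | otherwise = minimum [ table ! (i, k) + table ! (k + 1, j)
                               + d (i - 1) * d k * d j
                             | k <- [i .. j - 1] ]

For instance, matrixChainCost [10, 30, 5, 60] is 4500, corresponding to the order (A1.A2).A3.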
This example is published in Cormen's Introduction to Algorithms textbook. Will we run into copyright issues by putting it here?
- Absolutely not! Ideas cannot be copyrighted, only a particular way of expressing those ideas. CLRS itself based this section on various papers. In other words, we can write an explanation of the problem and its solution in our own words, but should be careful not to copy text or even the precise presentation of the problem. Derrick Coetzee 20:09, 5 Oct 2004 (UTC)
Confusion
The first paragraph of the second part, and the article on algorithms, state that dynamic programming is a bottom-up approach, but later this article says dynamic programming may use bottom-up or top-down approaches. --zeno 11:46, 9 Feb 2004 (UTC)
- I've revised that section. It was very vague and somewhat incorrect. I hope there's less confusion now. Derrick Coetzee 20:18, 5 Oct 2004 (UTC)
Added example
I added an example at the Implementations page, and I need people to check it. It concerns a checkerboard (not matrices) and I think it's correct, but I can't be sure. Should I put it on the main page? Well, read it and give an honest opinion.
Gkhan 23:45, Jul 26, 2004 (UTC)
- I think the example should be moved to Wikibooks. It is a bit long-winded, and not very encyclopedic stylistically. Fredrik | talk 20:46, 22 May 2005 (UTC)
Weak analysis of Fibonacci running time
The analysis of the naive Fibonacci computation is rather loose. It's not quite as bad as O(2^n); it's really Θ(phi^n) = Θ(fib(n)), and since phi < 2, the computation grows strictly slower than 2^n.
Granted, it is still exponential time. I don't know whether it is worth the tangent from the main topic, dynamic programming, to make the statement tighter, however. Rrenaud 01:56, 3 Dec 2004 (UTC)
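For what it's worth, the tight bound falls out directly: the number of calls C(n) made by the naive recursion satisfies C(0) = C(1) = 1 and C(n) = C(n-1) + C(n-2) + 1, which solves to C(n) = 2·fib(n+1) - 1, i.e. Θ(phi^n), exactly as stated above.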
- Thanks, I took care of this by just making it say "exponential time" and not be too specific. Deco 19:26, 17 Apr 2005 (UTC)
- I don't want to be nitpicky (is that a word?), but there is nothing wrong with saying that it's an O(2^n) algorithm, since big-O is just an upper bound. It would be wrong to say Θ(2^n). I personally preferred it the other way, but I guess this way is more accurate. Gkhan 22:15, Apr 17, 2005 (UTC)
- To say it's O(2^n) would be a bit silly though, when the point is that it takes a lot of time. We really want to give a lower bound in this context, not an upper bound. Deco 07:07, 23 May 2005 (UTC)
Fleizach: The example says m = min(a, b, c). It then says: if m = a, path = 1; if m = b, path = 2; and so on. This doesn't take into account the case where, for example, a and b are equal: the path taken may then not match the cost reported. Fixing this would make the code a lot more verbose, unfortunately; a sketch of one way around it is below.
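A sketch of one possible fix (hypothetical, in Haskell rather than the article's pseudocode): select the cost and the move together, so ties are broken once and deterministically:

 import Data.List (minimumBy)
 import Data.Ord (comparing)

 -- minimumBy keeps the earliest entry when costs tie, so the chosen
 -- path always matches the chosen cost.
 bestMove :: [(Int, Int)] -> (Int, Int)
 bestMove = minimumBy (comparing fst)

 -- e.g. bestMove [(a, 1), (b, 2), (c, 3)] yields both m and path at once.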
Discrete vs. Continuous, Deterministic vs. Stochastic
If I remember well from the courses I took a long time ago, dynamic programming comes in different flavors: discrete vs. continuous and stochastic vs. deterministic. Usually, in computer science, we just reduce dynamic programming to the deterministic, discrete case.
An example of continuous dynamic programming that I would like to see in Wikipedia is the solution of the Moon-landing problem, where one has to minimize the amount of rocket fuel a spaceship uses during lunar descent. Tomo 12:38, 16 Apr 2005 (UTC)
Figure 1
"Figure 1. Finding the shortest path using optimal substructure; wavy lines indicate shortest path"
Surely it's the bold lines that indicate the shortest path in this diagram? What do the wavy lines represent here? - Chris Wood 15:23, 3 Jun 2005 (UTC)
- The bold lines indicate the shortest path from the start to the goal. The wavy lines indicate the shortest path between the two vertices they connect. Clarifying now. Deco 18:55, 3 Jun 2005 (UTC)
- Just a thought—my edit (reverted, thanks Pm215) should be some indication that the graphic still doesn't make much sense. There's only one path between each pair of vertices, that is, the edge that connects them, and it must be the shortest of, um, itself, right? And just to add confusion, the wavy edges are the three longest in the graph! At the very least, the extra-short edge labeled '2' is clearly short, but it doesn't get any special treatment. I don't think I'm being dense, but this doesn't really make sense to me, and I wonder if I'm the only person staring at the graph who's lost. Patrick O'Leary 04:23, 15 November 2006 (UTC)
- The idea (I believe) is that the right hand side of the figure is the 'solved' subpart of the problem, so the wavy lines indicate the length of the (determined in previous iterations) shortest path through a set of nodes which aren't shown, whereas the straight edges indicate real connections in the small part of the graph we're trying to solve next. I agree that it's not very clear, and given the other problems noted below perhaps we should just drop this example? pm215 12:45, 18 November 2006 (UTC)
- Ahh, that sort of makes sense. Call it left-to-right language bias, but there's no indication that the solution is proceeding leftward, and since this is the English-language Wikipedia, some explicit mention of that would seem justified. But at any rate, removing it does seem like a viable option. Patrick O'Leary 18:46, 20 November 2006 (UTC)
This figure makes me a little nervous, since it just falls short of implying that one can search for shortest paths in a graph using a straightforward DP (say, memoization: shortestpath(i)). This is not possible, except in directed, acyclic graphs. I suggest altering the image so that the edges are directed (have an arrowhead at one end). Ruberik 02:05, 10 June 2006 (UTC)
- Hmm, you're right about that. For now I'll note it in the text. Deco 02:08, 10 June 2006 (UTC)
- Wow, you're speedy. Thanks; also, I've just realized that the example that refers to the image has a similar issue (at best) or is wrong (at worst). It talks about finding shortest paths from adjacent vertices first, which makes some sense if the graph is a DAG. The problem is that no matter how you slice it, you don't find shortest paths "from all adjacent vertices" first -- if I have a graph where the destination is T and I have nodes A and B neighbouring it, A->B costs 1, A->T costs 20, B->T costs 1, then computing all adjacent vertices in no particular order is not useful. Ruberik 02:26, 10 June 2006 (UTC)
- I'm not really sure how to fix this example; while its heart is in the right place, I don't think it has anything to do with DP. Changing the graph to a DAG helps, but the explanation would need changing to something much more complex (unless someone has a succinct way of doing it). Can anyone come up with a succinct example that fills the same "very brief introduction" purpose as the current one? Coin Changer is a good simple example of DP, but it still takes some space to explain. Ruberik 21:06, 12 June 2006 (UTC)
- Our best bet may be to stick with Fibonacci throughout. Sorry about that. Deco 03:10, 13 June 2006 (UTC)
- Agreed with Ruberik. It's misleading to start off with an "example" that has nothing to do with DP at all. The diagram is not correct unless the graph is directed and acyclic. Fixing it is no easy task. igor_nav
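To make the directed-acyclic restriction concrete, here is a small hypothetical sketch (mine, not from the article) of shortest-path-to-goal as a straightforward DP in Haskell. It assumes the vertices are numbered 0..n-1 in topological order (every edge goes from a lower number to a higher one) and that the goal is vertex n-1:

 import Data.Array

 -- dist ! u is the shortest distance from u to the goal. Because every
 -- edge (u, v, w) has u < v, the recursion can never loop back, which
 -- is exactly why memoization is safe on a DAG and unsafe in general.
 shortestToGoal :: Int -> [(Int, Int, Double)] -> Array Int Double
 shortestToGoal n edges = dist
   where
     adj  = accumArray (flip (:)) [] (0, n - 1)
                       [(u, (v, w)) | (u, v, w) <- edges]
     dist = listArray (0, n - 1) (map go [0 .. n - 1])
     go u
       | u == n - 1     = 0
       | null (adj ! u) = 1 / 0  -- dead end: goal unreachable from here
       | otherwise      = minimum [w + dist ! v | (v, w) <- adj ! u]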
Mild correction of pseudocode.
The article's pseudocode was inconsistent. At times it used Python-like indenting to represent the code structure, but it also included braces at times. I have added braces so that the pseudocode follows a C-like syntax: if the body of a loop, conditional, or whatnot is longer than one line, it is surrounded by braces.
A more aggressive editor might prefer that no braces be used. I don't know whether brace-less pseudocode would be clearer, but any consistent use of braces would be clearer than what was there. (Would always using braces be clearer -- for people who use a text-to-speech synthesizer, for example -- or would it just put in too much punctuation?) Chip Unicorn
- In pseudocode I prefer indentation to braces where it's not unclear, to avoid eye-clutter, but I don't disagree with your changes - it's still sparse enough to be readable. Deco 01:10, 23 Jun 2005 (UTC)
- I tried it without braces, and it looks visually clearer. If Wikipedia gets complaints that this is not understandable by some group, we can put in some way to distinguish the levels. Chip Unicorn 12:41, 23 Jun 2005 (UTC)
Haskell memoization?
The article makes the claim that Haskell automatically memoizes the results of every function call, so that dynamic programming is never necessary. Having just tried to find the 300th Fibonacci number by a naively recursive approach in Hugs, I can assure you that such is not the case, at least not in all implementations. Similar doubts are expressed on the Talk page for Levenshtein distance. In fact, I came across someone's blog entry which talks about memoization in Haskell with an example of memoizing the Fibonacci numbers, contradicting the article's claim that this is done automatically. -- Anonymous, 11 March 2006
No documents found confirming memoization in Haskell
I would like to add that
- The Haskell 98 Standard does not say anything about memoization.
- A Google search on haskell memoization does not point to any such documentation; it only gives documents describing techniques to achieve this by explicit programming. (Of course, this is not a very strong argument!)
- Even the most optimizing Glasgow Haskell Compiler does not perform this kind of memoization, as one can verify by checking that the evaluation time of the following naïve Fibonacci function grows exponentially in n:
 fib 0 = 0
 fib 1 = 1
 fib n = fib (n-1) + fib (n-2)
Thus I think the mention of Haskell memoizing should be removed.
(Of course it is possible to write memoizing functions in Haskell; the point is that there seems to be no automatic memoization.)
-- Abhay Parvate 06:20, 13 July 2006 (UTC)
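For completeness, the usual way to get the memoization by hand in Haskell is to back the function with a lazily built list; a minimal sketch:

 -- fibs is defined in terms of itself; laziness means each entry is
 -- computed at most once and then shared, which is exactly memoization.
 fibs :: [Integer]
 fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

 fib :: Int -> Integer
 fib n = fibs !! n

With this, fib 300 is immediate, but the sharing is something the programmer wrote down, not something the implementation inferred, which supports the point above.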
Old-fashioned Terminology
The word "Dynamic" makes most programmers think of dynamic programming language and/or Dynamic typing, or even dynamic compilation. Of course, Dynamic Programming was developed and named long before any of these (in fact, before any kind of compilation!) Maybe we should say something about this? Here's my first draft; improvements welcome!
- Dynamic programming is not related to dynamic programming languages or to dynamic compilation, both of which were developed later.
This could be appended to either of the introductory paragraphs. I'd like some feedback before I insert this. Chris Chittleborough 08:53, 26 March 2006 (UTC)
I'd like to mention that Sean R. Eddy described how the term was coined. It was meant to shield Richard Bellman's work from US Secretary of Defense Charles Wilson, who was hostile to math research. http://www.nature.com/nbt/journal/v22/n7/full/nbt0704-909.html -Huggie
Greedy vs Dynamic Programming
A few of the algorithms listed here, I believe, are not referred to as dynamic programming.
Certainly Dijkstra's shortest-path algorithm is not; it's more of a greedy algorithm. I believe Bellman-Ford might also be wrongly listed here, but I'm not as confident about that.
The n^2 version of Dijkstra's could be considered dynamic programming, though the far more common m + n log n version is surely not. I would term Bellman-Ford DP because it can be thought of as filling in a two-dimensional table, where T[i][j] is the shortest path to i using at most j edges. Since we don't care about the exact value of j, we tend to squash out that dimension of the array. One can think of Floyd-Warshall in a similar way.
According to section 24.3 of CLRS's Introduction to Algorithms (MIT Press), Dijkstra's single-source shortest-paths algorithm uses a greedy approach. This is quite logical given that we continually add the locally optimal vertex to the set of solved vertices at each step. I have thus removed it from the list of DP algorithms. Also, the n^2 version of Dijkstra's algorithm simply doesn't use a priority queue to sort the vertices (it has an O(n) extract-min operation instead). With regard to Bellman-Ford, the fact that it fills in a table T[i][j] doesn't make it a dynamic programming algorithm. In fact, I would almost call Bellman-Ford a brute-force algorithm, because it relaxes every single edge V times (O(VE))...it seems almost crude.
- Dijkstra would not be DP because it fills out a table; it would be DP because it recursively solves smaller sub-problems (one node fewer). It is greedy in the way it chooses which sub-problem to solve (kind of backwards: it chooses superproblems). Because of the way it chooses, it needs to do less work (one relax for each node). -- Nils Grimsmo 21:29, 3 August 2006 (UTC)
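To make the table view concrete, a rough Bellman-Ford sketch in Haskell (my own; it assumes vertices numbered 0..n-1 and edges given as (u, v, w) triples). As noted above, the j dimension is squashed: only the current row of T is kept:

 import qualified Data.Map as Map

 -- Row j of the DP table holds the shortest distance to each vertex
 -- using at most j edges. Each pass builds row j+1 from row j by
 -- trying every edge once; after n-1 passes the distances are final.
 bellmanFord :: Int -> [(Int, Int, Double)] -> Int -> Map.Map Int Double
 bellmanFord n edges src = iterate relax row0 !! (n - 1)
   where
     inf  = 1 / 0
     row0 = Map.fromList [(v, if v == src then 0 else inf) | v <- [0 .. n - 1]]
     relax row = foldl step row edges
       where
         step acc (u, v, w) = Map.insertWith min v (row Map.! u + w) acc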
Weirdness
The page says "i dont know", in exactly those words, which is not only weirdly misplaced but also improperly formatted. Someone should check.
Smith-Waterman Algorithm
I don't see any mention of the Smith-Waterman algorithm, which is a dynamic programming algorithm just like Needleman-Wunsch. Would it be appropriate to add that one as well? --scskowron 13:57, 13 December 2006 (UTC)
Bellman equation
Principle of optimality links here, but there is no mention of what the principle is. The principle, as I understand it, is basically the equivalence of certain dynamic programming problems to the Bellman equation (which should probably be mentioned here as well, as should the Hamilton-Jacobi-Bellman equation). Smmurphy(Talk) 00:53, 26 March 2007 (UTC)
- I put the BE into this article, and the PoO into the BE article. I'll change the PoO redirect to BE. Smmurphy(Talk) 02:27, 26 March 2007 (UTC)
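For readers landing here: one common discrete-time, deterministic form of the Bellman equation (notation varies between fields) is

 V(x) = max over actions a of [ F(x, a) + β·V(T(x, a)) ]

where x is the state, F the immediate payoff, T the transition function, and β a discount factor. The principle of optimality is the observation that the optimal value function must satisfy this recursion, because the tail of an optimal policy must itself be optimal for the subproblem it faces.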
Figure 1?
The caption on Figure 1 still doesn't make any sense. There's at most one path between any two vertices. Why aren't all the lines wavy, then? It seems like the figure should be replaced; it doesn't do much to support the article as is, and has been confusing readers for almost two years! -- Mikeblas 06:29, 27 March 2007 (UTC)
- I'd like to clarify the diagram, but I don't understand your objections at all. Why would there be at most one path between any two vertices? In most graphs there are many paths between any two chosen vertices. Why should all the lines be wavy? The other lines represent single edges, not paths. I also disagree with the claim that it doesn't add much; there are no other simple examples of optimal substructure presented here. The text provides some additional explanation, but if there's anything I can do to clarify it further I'd be happy to. Dcoetzee 20:27, 31 March 2007 (UTC)
- I think I might know what you're not clear about now. This is not a graph on 5 vertices; the wavy lines indicate paths that may include any number of nodes. The nodes on those paths are diagrammatically omitted because they don't matter; only the total length of the path matters. Only the neighbours of the start node, those immediately adjacent to it, are shown. Dcoetzee 21:00, 31 March 2007 (UTC)
Problem with Checkerboard Example
In the checkerboard example, the indices are reversed. As indicated, c(i, j) should reference the ith row and the jth column; however, as far as I can tell, the ith column and jth row are referenced instead. This error is propagated throughout the example.
Misplaced section
The part following "The steps for using dynamic program goes as follows:" is out of place. Out of nowhere, the article starts using terms like 'gap penalties' and 'sequence alignment'. It seems like this was copied from an article on a specific application, such as protein sequencing. Anon, Wed Jul 11 14:11:43 EDT 2007
Top-down approach
[quote] Top-down approach: The problem is broken into subproblems, and these subproblems are solved and the solutions remembered, in case they need to be solved again. This is recursion and memoization combined together. [/quote]
I thought the top-down approach wasn't part of dynamic programming; I thought it was used by greedy algorithms. —Preceding unsigned comment added by 84.192.157.164 (talk) 18:12, 5 January 2008 (UTC)
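As I understand it, both flavours are dynamic programming; greedy algorithms are a different thing entirely (they commit to one locally best choice rather than comparing the answers to all subproblems). A minimal sketch of both flavours for Fibonacci, in Haskell:

 import Data.Array

 -- Top-down: write the recursion as usual, but read subproblem answers
 -- out of a lazy array, so each one is computed at most once.
 fibTD :: Int -> Integer
 fibTD n = memo ! n
   where
     memo = listArray (0, n) (map go [0 .. n])
     go 0 = 0
     go 1 = 1
     go k = memo ! (k - 1) + memo ! (k - 2)

 -- Bottom-up: start from the base cases and iterate upward.
 fibBU :: Int -> Integer
 fibBU n = fst (iterate (\(a, b) -> (b, a + b)) (0, 1) !! n)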