Talk:Markov decision process
From Wikipedia, the free encyclopedia
It would also be nice to have a section on Semi-Markov Decision Processes. (This extension to MDP is particularly important for intrinsically motivated RL and temporal abstraction in RL.) Npdoty 01:33, 24 April 2006 (UTC)
It would be nice to hear about partially observable MDPs as well! --Michael Stone 22:59, 23 May 2005 (UTC)
- Not to mention a link to Markov chain! I've been meaning to expand this article, but I'm trying to decide how best to do it. Anything that can go in Markov chain should go there, and only material specific to decision processes should go here, though the two articles will need frequent cross-references. I think eventually POMDPs should get their own article as well, which should similarly avoid duplicating material in Hidden Markov model. --Delirium 06:15, 13 November 2005 (UTC)
What is γ
The constant γ is used but never defined. What is it?
- Well, at its first use it's described as "discounting factor γ (usually just under 1)", which pretty much defines it. Do you think it needs to be more prominent than that?
- This is now fixed in the article.
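- For readers arriving at this thread with the same question, here is a minimal sketch of what γ does, assuming the standard definition of the discounted return as the sum of γ^t times the reward received at step t. The function name `discounted_return` and the sample reward list are illustrative and not taken from the article.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t], for t = 0, 1, 2, ..."""
    total = 0.0
    # Accumulate from the last reward backwards, so each step needs
    # only one multiplication and one addition.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical reward sequence: one unit of reward at each of four steps.
rewards = [1.0, 1.0, 1.0, 1.0]
print(discounted_return(rewards, 1.0))  # 4.0   (no discounting)
print(discounted_return(rewards, 0.5))  # 1.875 (1 + 0.5 + 0.25 + 0.125)
```

With γ = 1 the four rewards simply add up; with γ = 0.5 each later reward counts for half as much as the one before it. Choosing γ just under 1 keeps the sum finite over an infinite horizon when rewards are bounded, while still giving future rewards substantial weight.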
Invented by Howard?
"They were invented by Ronald A. Howard in 1960"
Is that right? Is "invent" the proper term? Also, weren't there works on MDPs (even if with other names) before 1960?
- Stochastic games had already been introduced in [1]. Since they are more general than MDPs, I would be surprised if MDPs were not used even earlier than that.
- [1] Shapley, L.S.: "Stochastic Games". Proceedings of the National Academy of Sciences 39(10), pages 1095-1100, 1953.
—The preceding unsigned comment was added by Svensandberg (talk • contribs) 13:31, 9 January 2007 (UTC).
- "Invent" may not be the right word. However, Howard's book was very important. In his book "Dynamic Programming", E. V. Denardo does mention Shapley (1953) but adds "a lovely book by Howard (1960) highlighted policy iteration and aroused great interest in this model". So that book set off a lot of subsequent research, and it is still a classic. Feel free to replace the word "invent" with a more appropriate one. Encyclops 22:58, 9 January 2007 (UTC)
- I rewrote that a bit and added a reference to Bellman 1957 (which I found in Howard's book). OK? Svensandberg 16:31, 17 January 2007 (UTC)