Talk:Multi-armed bandit

Epsilon greedy

The description of the ε-greedy strategy has been removed. Is it because all strategies (there are a lot of them for the bandit problem) are expected to be completed at once, or completed here before being committed to the article? --Vermorel 15:39, 8 October 2005 (UTC)

Gittins Index

I am told that a useful tool for Multi-armed Bandit Problems is the Gittins Index (whatever that is). Perhaps someone could add a definition and explanation. Encyclops 22:03, 27 January 2007 (UTC)

  • This paper should be a good reference: J.C. Gittins. A dynamic allocation index for the discounted multi-armed bandit problem. Biometrika (1979), pp. 580-597. Sancho (talk) 03:55, 14 March 2007 (UTC)

Strategies to be completed

There's no point in keeping "to be completed" work-in-progress notices in place for almost two years. I removed these:

to be completed: Probability matching strategies (softmax, gaussmatch, Exp3), Pricing strategies (interval estimation, poker)
Agentbla (talk) 22:45, 4 September 2007 (UTC)