Gittins index

From Wikipedia, the free encyclopedia

The Gittins index is commonly introduced through the classic two-armed bandit problem. The objective is to play the arms, one at a time, in any order, so as to maximise the expected discounted earnings. The critical factors are that the player does not know the payoff probability of either arm to begin with, and can only gain knowledge of these probabilities by actually playing the arms.
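The setup can be illustrated with a minimal Python sketch. It assumes each arm pays 1 with a fixed probability hidden from the player and 0 otherwise, and that earnings at play t are weighted by discount**t. The names discounted_earnings, arm_probs, choose_arm and the default values are hypothetical, chosen only for illustration.

```python
import random


def discounted_earnings(arm_probs, choose_arm, discount=0.9, rounds=50, seed=0):
    """Simulate one run of a bandit and return the total discounted earnings.

    arm_probs  -- true payoff probabilities (hidden from the policy)
    choose_arm -- policy: maps the observed history to the index of the arm to play
    """
    rng = random.Random(seed)
    # history[i] = [successes, failures] observed so far on arm i;
    # this is all the player ever learns about the arms.
    history = [[0, 0] for _ in arm_probs]
    total = 0.0
    for t in range(rounds):
        arm = choose_arm(history)
        reward = 1 if rng.random() < arm_probs[arm] else 0
        history[arm][0 if reward else 1] += 1
        total += (discount ** t) * reward  # later earnings count for less
    return total


def always_first(history):
    """A naive policy that never explores the second arm."""
    return 0


def greedy(history):
    """Play the arm with the best observed success rate (untried arms count as 0.5)."""
    def rate(counts):
        s, f = counts
        return s / (s + f) if s + f else 0.5
    return max(range(len(history)), key=lambda i: rate(history[i]))
```

A policy only sees the history of successes and failures, never arm_probs, which mirrors the constraint that knowledge is gained only by playing.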

In simple terms, the Gittins index of an arm is the known payoff probability at which a player would be indifferent between two options: playing only an arm with that known probability forever, or at least trying the other arm, with the option of switching at some future time to the known arm and continuing with it forever.
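This indifference description suggests a way to approximate the index numerically. The sketch below is one possible approach, not a definitive implementation. It assumes a 0/1 payoff, a Beta(successes, failures) belief about the unknown arm, a discount factor, and a finite look-ahead horizon after which the player commits to whichever arm looks better. The function names and default parameters are hypothetical. It searches by bisection for the known probability p at which trying the unknown arm is worth exactly as much as playing the known arm forever.

```python
def continuation_value(p, a0, b0, discount, horizon):
    """Value of playing the unknown arm once, then acting optimally, when a
    known arm paying with probability p is always available as a fallback."""
    retire = p / (1 - discount)  # value of playing the known arm forever
    # At the look-ahead limit, assume the player commits forever to the better
    # of the known arm and the unknown arm's posterior mean (an approximation).
    values = [
        max(p, (a0 + s) / (a0 + b0 + horizon)) / (1 - discount)
        for s in range(horizon + 1)
    ]
    # Backward induction; s counts the successes observed after `depth` plays.
    for depth in range(horizon - 1, -1, -1):
        nxt = []
        for s in range(depth + 1):
            mean = (a0 + s) / (a0 + b0 + depth)
            cont = (mean * (1 + discount * values[s + 1])
                    + (1 - mean) * discount * values[s])
            # At the root we want the value of trying the unknown arm itself.
            nxt.append(cont if depth == 0 else max(retire, cont))
        values = nxt
    return values[0]


def gittins_index(successes, failures, discount=0.9, horizon=100, tol=1e-6):
    """Approximate the index as the indifference probability p, by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        if continuation_value(p, successes, failures, discount, horizon) > p / (1 - discount):
            lo = p  # trying the unknown arm is still worth more: raise p
        else:
            hi = p  # the known arm is at least as good: lower p
    return (lo + hi) / 2
```

With a uniform Beta(1, 1) belief the posterior mean is 0.5, yet the computed index is noticeably higher, because trying the arm also yields information that can be exploited later.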

A good example is attempting to pick which of two emerging technologies is likely to be successful in the long run. Each technology improves through learning, so the technology that has a head start may appear superior initially. The learning curve may reinforce this perception of superiority even if the technology turns out to be inferior in the long run (compare the Betamax and VHS video formats). Only by implementing both technologies can the true answer be known.