Mixed logit
From Wikipedia, the free encyclopedia
Mixed logit is a fully general statistical model of discrete choice that can approximate any random utility model. The inspiration for the mixed logit model came from the limitations of the standard logit and probit models. The standard logit model has three problems that mixed logit solves. "It [Mixed Logit] obviates the three limitations of standard logit by allowing for random taste variation, unrestricted substitution patterns, and correlation in unobserved factors over time."[1] Mixed logit can also use any distribution for the random coefficients, unlike probit, which is limited to the normal distribution.
Random taste variation
The standard logit model's "taste" coefficients, or betas, are fixed, which means the betas are the same for everyone. Mixed logit allows the betas to differ for each respondent or person.
The utility of person n for alternative i with the standard logit model is:
$$U_{ni} = \beta X_{ni} + \varepsilon_{ni}$$
with
- $\varepsilon_{ni}$ ~ iid extreme value
The utility of person n for alternative i with the mixed logit model is:
$$U_{ni} = \beta_n X_{ni} + \varepsilon_{ni}$$
with
- $\varepsilon_{ni}$ ~ iid extreme value
- $\beta_n \sim f(\beta \mid \theta)$
where θ denotes the parameters of the distribution of β over the population, such as its mean and standard deviation. Mixed logit is also called the random coefficients model, since $\beta_n$ is a random variable. It allows the slopes of the model to be random, extending the random effects model, in which only the intercept is stochastic.
The density of the coefficients over the population can be specified with a variety of distributions. This gives the researcher more flexibility than probit, where the distribution is fixed as normal.
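The difference between fixed and random taste coefficients can be illustrated with a minimal sketch. It assumes a single attribute per alternative, a normal density for the coefficients with mean `b` and standard deviation `s`, and made-up attribute values; the names `logit_probabilities`, `fixed_beta`, and `person_betas` are hypothetical and chosen only for this illustration.

```python
import math
import random


def logit_probabilities(beta, x):
    """Standard logit choice probabilities for one person.

    beta: scalar taste coefficient.
    x: list with one attribute value per alternative.
    """
    utilities = [beta * x_i for x_i in x]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical data: one attribute value for each of three alternatives.
x = [1.0, 0.5, -0.2]

# Standard logit: a single beta shared by everyone.
fixed_beta = 0.8
print("standard logit:", logit_probabilities(fixed_beta, x))

# Mixed logit: each person n has an own beta_n drawn from f(beta | theta).
# Assumption for this sketch: f is normal with theta = (b, s).
rng = random.Random(0)
b, s = 0.8, 1.0
person_betas = [rng.gauss(b, s) for _ in range(5)]
for n, beta_n in enumerate(person_betas):
    probs = logit_probabilities(beta_n, x)
    print(f"person {n}: beta_n = {beta_n:.3f}, probabilities =",
          [round(p, 3) for p in probs])
```

Each person's probabilities still follow the logit formula; only the coefficient that enters it varies from person to person.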
Unrestricted substitution patterns
The mixed logit model does not have a restrictive substitution pattern because, unlike logit, it does not exhibit independence of irrelevant alternatives (IIA). "The percentage change in the probability for one alternative given a change in the mth attribute of another alternative is
$$E_{ni X_{nj}^m} = -\frac{X_{nj}^m}{P_{ni}} \int \beta^m L_{ni}(\beta) L_{nj}(\beta) f(\beta)\, d\beta = -X_{nj}^m \int \beta^m L_{nj}(\beta) \left[\frac{L_{ni}(\beta)}{P_{ni}}\right] f(\beta)\, d\beta$$
where $\beta^m$ is the mth element of β."[2] Here $L_{ni}(\beta)$ is the logit probability that respondent n chooses alternative i, evaluated at a given β, and $P_{ni}$ is its average over the density f(β). Because the weight $L_{ni}(\beta)/P_{ni}$ inside the integral differs across alternatives i, the elasticity is different for each alternative, so that "A ten-percent reduction for one alternative need not imply (as with logit) a ten-percent reduction in each other alternative."[3] The relative percentages depend on how the likelihood that respondent n will choose alternative i, $L_{ni}$, and the likelihood that respondent n will choose alternative j, $L_{nj}$, move together over the various values of β. The distribution of β is whichever probability density function the researcher considers appropriate for the data.
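The departure from IIA can be checked numerically. The sketch below is an illustration under stated assumptions, not a reference implementation: it uses one attribute per alternative, a normal mixing density with hypothetical mean `b` and standard deviation `s`, and made-up attribute values. It compares the ratio of the probabilities of the first two alternatives before and after the attribute of the third alternative changes.

```python
import math
import random


def logit_probabilities(beta, x):
    """Logit probabilities of all alternatives for a given beta."""
    utilities = [beta * x_i for x_i in x]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def mixed_logit_probabilities(x, b, s, num_draws=5000, seed=0):
    """Average the logit probabilities over draws beta ~ Normal(b, s)."""
    rng = random.Random(seed)  # same seed gives the same draws in every call
    sums = [0.0] * len(x)
    for _ in range(num_draws):
        beta = rng.gauss(b, s)
        for i, p in enumerate(logit_probabilities(beta, x)):
            sums[i] += p
    return [total / num_draws for total in sums]


# Hypothetical attribute values; only the third alternative changes.
x_before = [1.0, 0.5, 0.0]
x_after = [1.0, 0.5, 2.0]

# Standard logit: the ratio P_0 / P_1 ignores the third alternative (IIA).
logit_before = logit_probabilities(1.0, x_before)
logit_after = logit_probabilities(1.0, x_after)
logit_ratio_before = logit_before[0] / logit_before[1]
logit_ratio_after = logit_after[0] / logit_after[1]

# Mixed logit: the same ratio shifts when the third alternative changes.
mixed_before = mixed_logit_probabilities(x_before, b=1.0, s=2.0)
mixed_after = mixed_logit_probabilities(x_after, b=1.0, s=2.0)
mixed_ratio_before = mixed_before[0] / mixed_before[1]
mixed_ratio_after = mixed_after[0] / mixed_after[1]

print("logit ratio:", logit_ratio_before, "->", logit_ratio_after)
print("mixed logit ratio:", mixed_ratio_before, "->", mixed_ratio_after)
```

Reusing the same seed in both mixed logit calls means the two ratios are computed from identical draws, so the change in the ratio reflects the model rather than simulation noise.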
Correlation in unobserved factors over time
Standard logit does not take into account unobserved factors that persist over time for a given decision maker. This is a problem with panel data, which consist of repeated choices by the same people over time. Applying a standard logit model to panel data assumes that the unobserved factors affecting a person's choice are new every time the choice is made. That is a very unlikely assumption. Taking into account both random taste variation and correlation in unobserved factors over time, the utility for respondent n for alternative i at time t is
$$U_{nit} = \beta_n X_{nit} + \varepsilon_{nit}$$
where the subscript t is the time dimension. We still make the logit assumption that $\varepsilon_{nit}$ is i.i.d. extreme value. That means that $\varepsilon_{nit}$ is independent over time, people, and alternatives; it is essentially just white noise.
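For a normal distribution, the βs have mean b and standard deviation s. Then the utility equation becomes:
$$U_{nit} = (b + s\eta_n) X_{nit} + \varepsilon_{nit}$$
where $\eta_n$ is a draw from the standard normal density. Rearranging, that equation becomes:
$$U_{nit} = bX_{nit} + (s\eta_n X_{nit} + \varepsilon_{nit})$$
- $U_{nit} = bX_{nit} + e_{nit}$
with $e_{nit} = s\eta_n X_{nit} + \varepsilon_{nit}$.
In the preceding equation the observed factors are separated from the unobserved factors. Of the unobserved factors, $\varepsilon_{nit}$ is independent over time, while $s\eta_n X_{nit}$ is not independent over time, because the same $\eta_n$ enters every period.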
Then the covariance between the unobserved factors in two different periods t and q is
- $\mathrm{Cov}(e_{nit}, e_{niq}) = s^2 X_{nit} X_{niq}$
Thus, giving the explanatory variables, the X's, random coefficients induces correlation in unobserved utility over time.
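The covariance result can be checked by simulation. The following sketch assumes a standard normal taste draw per person, standard extreme value (Gumbel) errors generated by inverting the distribution function, and made-up values for the standard deviation `s` and the attributes `x_t` and `x_q`; all function and variable names are hypothetical.

```python
import math
import random


def draw_extreme_value(rng):
    """Draw from the standard type I extreme value (Gumbel) distribution."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def sample_covariance(a, b):
    """Unbiased sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cross = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return cross / (len(a) - 1)


def simulate_error_covariance(s, x_t, x_q, num_people=50000, seed=0):
    """Simulate the unobserved factors in two periods and return their covariance."""
    rng = random.Random(seed)
    e_t, e_q = [], []
    for _ in range(num_people):
        eta_n = rng.gauss(0.0, 1.0)  # one taste draw per person, shared by both periods
        e_t.append(s * eta_n * x_t + draw_extreme_value(rng))
        e_q.append(s * eta_n * x_q + draw_extreme_value(rng))
    return sample_covariance(e_t, e_q)


# Hypothetical values for the standard deviation and the attributes.
s, x_t, x_q = 1.5, 2.0, 1.0
simulated = simulate_error_covariance(s, x_t, x_q)
theoretical = s ** 2 * x_t * x_q
print("simulated covariance:", simulated)
print("theoretical covariance:", theoretical)
```

Setting `s` to zero removes the shared taste draw, and the simulated covariance then falls to roughly zero, which corresponds to the standard logit assumption.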
Example
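If you observe a sequence of a decision maker's choices, for example which coffee maker he or she buys each time a coffee maker is needed, then the probability of that sequence of choices, conditional on β, is simply the product of the logit probabilities of the individual choices (assuming the error term, $\varepsilon_{nit}$, is i.i.d. extreme value).
Mathematically, this looks like the following:
$$L_n(\beta_n) = \prod_t \frac{e^{\beta_n X_{nit}}}{\sum_j e^{\beta_n X_{njt}}}$$
where i denotes the alternative chosen at time t. The unconditional probability is then simply the integral of the product of the logits over the density of β:
$$P_{ni} = \int L_n(\beta) f(\beta \mid \theta)\, d\beta$$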
Unfortunately there is no closed-form solution to this integral, so the researcher must simulate $P_{ni}$. Fortunately for the researcher, simulating $P_{ni}$ can be very simple for certain distributions, although it can be very difficult for others. There are four basic steps to follow:
1. Take a draw from the probability density function that you assigned to the "taste" parameter.
2. Calculate $L_n(\beta^r)$, the probability conditional on that draw, with the logit formula. This requires the utility of every alternative, since each one enters the logit denominator.
3. Repeat many times.
4. Average the results.
Then the formula for the simulation looks like the following:
$$\tilde{P}_{ni} = \frac{1}{R} \sum_{r=1}^{R} L_n(\beta^r)$$
where R is the total number of draws taken from the distribution, and r indexes one draw.
Once this is done you will have a value for the probability of each alternative i for each respondent n.
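The four steps can be sketched in a few lines of code. This is a minimal illustration under stated assumptions rather than a definitive implementation: the taste parameter is assumed normal with hypothetical mean `b` and standard deviation `s`, there is a single attribute per alternative, and the panel data and function names are invented for the example.

```python
import math
import random


def logit_probability(beta, x_alternatives, chosen):
    """Logit probability of the chosen alternative in one choice situation."""
    utilities = [beta * x for x in x_alternatives]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    return exp_u[chosen] / sum(exp_u)


def sequence_likelihood(beta, x_panel, choices):
    """L_n(beta): product of the logit probabilities over the observed choices."""
    likelihood = 1.0
    for x_t, chosen_t in zip(x_panel, choices):
        likelihood *= logit_probability(beta, x_t, chosen_t)
    return likelihood


def simulate_probability(x_panel, choices, b, s, num_draws=5000, seed=0):
    """Simulated probability of a sequence of choices under a normal mixing density."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_draws):  # step 3: repeat many times
        beta_r = rng.gauss(b, s)  # step 1: take a draw of the taste parameter
        total += sequence_likelihood(beta_r, x_panel, choices)  # step 2
    return total / num_draws  # step 4: average the results


# Hypothetical panel: two purchase occasions, three coffee makers each time.
x_panel = [[1.0, 0.5, 0.0], [0.2, 0.9, 0.4]]
choices = [0, 1]  # index of the alternative chosen on each occasion
print("simulated probability:", simulate_probability(x_panel, choices, b=0.5, s=1.0))
```

With `s` set to zero every draw equals `b`, and the simulated value reduces to the standard logit probability of the sequence, which gives a simple sanity check.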
References
- [1] CB495-06Drv.tex
- [2] Train, Kenneth
- [3] Train, Kenneth