Random multinomial logit

From Wikipedia, the free encyclopedia

Introduction

Random multinomial logit (RMNL) is a statistical technique for (multi-class) classification that uses repeated multinomial logit analyses. It is inspired by the principles of random forests, which were developed by Leo Breiman.

Rationale for the new method

Several learning algorithms have been proposed to handle multiclass classification. While some algorithms are merely an extension or combination of intrinsically binary classification methods (e.g. multiclass classifiers built from one-versus-one or one-versus-all binary classifiers), other algorithms, such as multinomial logit (MNL), are specifically designed to map features to a multiclass output vector. MNL is valued for its robustness and has a proven track record in many disciplines, among them transportation research and customer relationship management (CRM). Unfortunately, MNL suffers from the curse of dimensionality, which implicitly necessitates feature selection, i.e., the selection of a best subset of variables from the input feature set. In contrast to binary logit, software packages to date mostly lack any feature selection algorithm for MNL. This absence constitutes a serious problem for several application areas.

Random forests, i.e., classifiers combining a forest of decision trees grown on random input vectors and splitting nodes on a random subset of features, have been introduced for the classification of binary and multiclass outputs. Feature selection is implicitly incorporated during the construction of each tree: at each node of a decision tree in the forest, the best variable to split on is selected from a random subset of variables. During classification, only those features needed for the test pattern under consideration are involved.

Given random forests' robustness and competence in analyzing large feature spaces, and MNL's weakness in the latter, the random forests approach can be applied to MNL, i.e., building a forest of MNLs, to unite the best of both worlds. To this end, Prinzie & Van den Poel (2008) propose a new method, the Random MultiNomial Logit (RMNL), a random forest of multinomial logits.
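
The idea of a forest of MNLs can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, by analogy with random forests, that each MNL is fitted on a bootstrap sample of the observations using a random subset of the features, and that the ensemble prediction is obtained by averaging the predicted class probabilities. The combination rule, the gradient-descent fitting, the toy data and all function and parameter names (fit_mnl, fit_rmnl, n_models, m_features, and so on) are hypothetical choices made for illustration only.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba_mnl(W, x):
    """Class probabilities of one MNL; each row of W is [bias, w_1, ..., w_d]."""
    scores = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)) for w in W]
    return softmax(scores)


def fit_mnl(X, y, n_classes, lr=0.2, epochs=200):
    """Fit one multinomial logit by plain batch gradient descent (assumed optimizer)."""
    n, d = len(X), len(X[0])
    W = [[0.0] * (d + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (d + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            p = predict_proba_mnl(W, x)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                grad[k][0] += err
                for j in range(d):
                    grad[k][j + 1] += err * x[j]
        for k in range(n_classes):
            for j in range(d + 1):
                W[k][j] -= lr * grad[k][j] / n
    return W


def fit_rmnl(X, y, n_classes, n_models=9, m_features=2, seed=0):
    """Fit an ensemble of MNLs, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    ensemble = []
    for _ in range(n_models):
        rows = [rng.randrange(n) for _ in range(n)]        # bootstrap sample (assumption)
        feats = sorted(rng.sample(range(d), m_features))   # random feature subset
        Xb = [[X[i][j] for j in feats] for i in rows]
        yb = [y[i] for i in rows]
        ensemble.append((feats, fit_mnl(Xb, yb, n_classes)))
    return ensemble


def predict_rmnl(ensemble, x):
    """Predict a class by averaging the member MNLs' probabilities (assumed rule)."""
    n_classes = len(ensemble[0][1])
    avg = [0.0] * n_classes
    for feats, W in ensemble:
        p = predict_proba_mnl(W, [x[j] for j in feats])
        for k in range(n_classes):
            avg[k] += p[k] / len(ensemble)
    return max(range(n_classes), key=avg.__getitem__)


def make_toy_data(n_per_class, seed=0):
    """Three Gaussian clusters in four dimensions (synthetic, for illustration only)."""
    rng = random.Random(seed)
    centers = [(0, 0, 0, 0), (3, 0, 3, 0), (0, 3, 0, 3)]
    X, y = [], []
    for label, center in enumerate(centers):
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0, 0.5) for c in center])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data(20, seed=1)
    ensemble = fit_rmnl(X, y, n_classes=3)
    hits = sum(predict_rmnl(ensemble, x) == t for x, t in zip(X, y))
    print("training accuracy:", hits / len(y))
```

Because each member model sees only m_features of the inputs, no single MNL has to be estimated on the full feature space, which is the property the method borrows from random forests.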

Application

In their application paper, the developers of the RMNL technique (Prinzie & Van den Poel, 2008) demonstrate its usefulness for cross-sell analysis in customer relationship management.

References

Prinzie A. & Van den Poel D. (2008), Random Forests for Multi-Class Classification: Random Multinomial Logit, Expert Systems with Applications, Vol. 35, No. 3, Forthcoming.

See Also