In machine learning, multi-label classification is a variant of the classification problem where multiple target labels may be assigned to each instance. Multi-label classification should not be confused with multiclass classification, which is the problem of categorizing instances into precisely one of more than two classes.
There are two main methods for tackling the multi-label classification problem[1]: problem transformation methods and algorithm adaptation methods. Problem transformation methods transform the multi-label problem into a set of binary classification problems, which can then be handled with standard single-label classifiers. Algorithm adaptation methods adapt existing single-label algorithms to directly perform multi-label classification.
Several problem transformation methods exist for multi-label classification. A common one is binary relevance (BR), in which one independent binary classifier is trained per label. Another is the label combination (LC) transformation, also known as label powerset (LP), which treats every label combination observed in the training set as a single class and trains one multi-class classifier over these combined classes. Other transformation methods include RAkEL[2] and classifier chains (CC)[3], in which the binary classifiers are linked in a chain and each receives the predictions of the preceding classifiers as additional features. Algorithm adaptation methods have also been developed, such as ML-kNN[4], a multi-label variant of the k-nearest neighbors lazy classifier.
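As a minimal sketch of how two of these transformations can be realized, the example below uses scikit-learn (an assumption; the article itself only mentions Mulan and Meka) to fit binary relevance via MultiOutputClassifier and a classifier chain via ClassifierChain on synthetic multi-label data. It is an illustration of the transformations, not a reference implementation of RAkEL or the cited methods.

```python
# Sketch: binary relevance and classifier chains with scikit-learn (assumed available).
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Synthetic multi-label data: each row of Y is a 0/1 indicator vector of labels.
X, Y = make_multilabel_classification(n_samples=500, n_classes=5, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# Binary relevance: one independent binary classifier per label.
br = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X_train, Y_train)

# Classifier chain: each classifier also receives the previous labels as features,
# allowing correlations between labels to be exploited.
cc = ClassifierChain(LogisticRegression(max_iter=1000), order="random",
                     random_state=0).fit(X_train, Y_train)

Y_pred_br = br.predict(X_test)   # indicator matrix of predicted label sets
Y_pred_cc = cc.predict(X_test)
```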
Evaluation metrics for multi-label classification differ from those used in multi-class (or binary) classification, because a prediction may be partially correct: some of an instance's labels can be predicted correctly while others are missed. Commonly used metrics include the Hamming loss, the exact-match ratio (subset accuracy), precision, recall and the F1 score computed per label and then micro- or macro-averaged, and the example-based Jaccard index.
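The following sketch shows how these metrics can be computed with scikit-learn (again an assumption, not a tool named by the article), given ground-truth and predicted label sets represented as 0/1 indicator matrices; the small example arrays are purely illustrative.

```python
# Sketch: common multi-label metrics on 0/1 indicator matrices of shape (n_samples, n_labels).
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, hamming_loss, jaccard_score

Y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])
Y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

print(hamming_loss(Y_true, Y_pred))                       # fraction of misclassified labels
print(accuracy_score(Y_true, Y_pred))                     # exact-match ratio (subset accuracy)
print(f1_score(Y_true, Y_pred, average="micro"))          # micro-averaged F1 over all labels
print(f1_score(Y_true, Y_pred, average="macro"))          # macro-averaged F1 (mean per-label F1)
print(jaccard_score(Y_true, Y_pred, average="samples"))   # example-based Jaccard index
```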
Implementations of multi-label algorithms are available in the Mulan and Meka software packages, both based on Weka.
A list of commonly used multi-label datasets is available at the Mulan website.