Information Gain Feature Selection for Multi-Label Classification

Authors

  • Rafael B. Pereira, Universidade Federal Fluminense (UFF), http://orcid.org/0000-0001-5181-6569
  • Alexandre Plastino, Universidade Federal Fluminense (UFF)
  • Bianca Zadrozny, IBM Research, Brazil
  • Luiz H. C. Merschmann, Universidade Federal de Ouro Preto (UFOP)

Keywords

classification, data mining, feature selection, multi-label classification

Abstract

In many important application domains, such as text categorization, biomolecular analysis, scene or video classification, and medical diagnosis, instances are naturally associated with more than one class label, giving rise to multi-label classification problems. This has led, in recent years, to a substantial amount of research in multi-label classification. In particular, many feature selection methods have been developed to identify relevant and informative features for multi-label classification. However, most methods proposed for this task rely on transforming the multi-label data set into a single-label one. Moreover, no single work has carried out a comprehensive evaluation of the various multi-label classification techniques coupled with feature selection methods over data sets from different domains. In this work, we perform such an experimental evaluation and also propose an adaptation of the information gain feature selection technique that handles multi-label data directly.
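
To make the transformation-based approach mentioned in the abstract concrete, the following is a minimal sketch, assuming discrete features and a label powerset transformation (each distinct label set becomes one single-label class), after which standard information gain IG(C; F) = H(C) - H(C | F) is used to rank features. It is not the adaptation proposed in the paper, whose details the abstract does not give; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of hashable class values."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(C; F) = H(C) - H(C | F) for one discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def label_powerset(label_sets):
    """Turn each label set into one hashable single-label class (assumed transformation)."""
    return [frozenset(s) for s in label_sets]


# Toy data, made up for illustration: two binary features per instance
# and one set of labels per instance.
X = [(1, 0), (1, 1), (0, 0), (0, 1)]
Y = [{"sports"}, {"sports"}, {"politics", "economy"}, {"politics", "economy"}]

classes = label_powerset(Y)
scores = [information_gain([row[j] for row in X], classes) for j in range(2)]

# Feature indices ordered from most to least informative.
ranking = sorted(range(2), key=lambda j: scores[j], reverse=True)
```

In this toy example the first feature separates the two label sets perfectly while the second carries no information about them, so the first feature is ranked higher. A known limitation of this kind of transformation is that the number of distinct label sets can grow quickly, which is one motivation for methods that handle multi-label data directly.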

Published

2015-10-12