Hierarchical Bottom-Up Safe Semi-Supervised Support Vector Machines for Multi-Class Transductive Learning

  • Thiago F. Covões, University of Sao Paulo (USP) at Sao Carlos
  • Rodrigo C. Barros, University of Sao Paulo (USP) at Sao Carlos
  • Tiago S. da Silva, University of Sao Paulo (USP) at Sao Carlos
  • Eduardo R. Hruschka, University of Sao Paulo (USP) at Sao Carlos
  • Andre C. P. L. F. de Carvalho, University of Sao Paulo (USP) at Sao Carlos
Keywords: multi-class SVMs, safe support vector machines, semi-supervised learning, transductive learning

Abstract

Semi-supervised approaches have been successfully applied to many machine learning problems. A particular case of the semi-supervised setting is transductive learning, in which the goal is solely to label the available unlabeled data, rather than to generate a predictive model that generalizes well to unseen data. Nevertheless, there are cases in which a transductive learner performs even worse than an inductive learner trained with only the labeled data. To alleviate this problem, safe semi-supervised support vector machines (S4VM) were proposed, so that a transductive SVM would never degrade performance when compared to its inductive counterpart. Although robust, S4VM cannot naturally handle multi-class problems and has to rely on multi-class encoding schemes, such as the one-vs-one and one-vs-all strategies. These schemes may not take advantage of the full potential of S4VM, especially in complex multi-class problems with overlapping classes. In this article, we address this problem by providing a binary-tree scheme for aggregating distinct S4VMs in a bottom-up fashion. The proposed approach is named HiBUST, which stands for Hierarchical Bottom-Up S4VM Tree. Experimental results show that HiBUST can provide increased predictive performance for many multi-class problems when compared to a one-vs-one S4VM. In addition, we show that HiBUST is also beneficial for complex binary-class problems.
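To make the idea of a bottom-up binary tree over classes concrete, the following is a minimal sketch and not the HiBUST algorithm itself. The abstract does not state how groups of classes are merged, so the centroid-distance criterion used here is purely an assumption, and the function names (build_bottom_up_tree, count_internal_nodes) are hypothetical. Each internal node of the resulting tree marks a place where one binary classifier, such as an S4VM, would separate the left group of classes from the right group.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def build_bottom_up_tree(labeled):
    """Merge class groups pairwise until a single binary tree remains.

    `labeled` maps a class label (a non-tuple value, e.g. a string) to a
    list of labeled points. The result is a nested 2-tuple whose leaves
    are class labels.
    """
    # Each cluster is (subtree, points belonging to that subtree).
    clusters = [(label, list(pts)) for label, pts in labeled.items()]
    while len(clusters) > 1:
        # ASSUMPTION: merge the two groups with the closest centroids.
        # The source abstract does not specify the merge criterion.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(
                centroid(clusters[ij[0]][1]), centroid(clusters[ij[1]][1])
            ),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = rest + [merged]
    return clusters[0][0]


def count_internal_nodes(tree):
    """Number of binary splits (one binary classifier each) in the tree."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + sum(count_internal_nodes(child) for child in tree)
```

For k classes, a binary tree of this kind has k - 1 internal nodes, whereas a one-vs-one encoding trains k(k - 1)/2 pairwise classifiers. Whether fewer, coarser splits help in practice depends on the data and on the merge criterion.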

Author Biographies

Thiago F. Covões, University of Sao Paulo (USP) at Sao Carlos
Thiago Ferreira Covões received his B.Sc. degree in Computer Science from the University of Santos, Brazil, in 2007, and his Master of Science degree in Computer Science from the University of São Paulo (USP) at São Carlos, Brazil, in 2010. He is currently a PhD candidate at USP São Carlos, Brazil. His research interest is data mining, with emphasis on clustering algorithms, semi-supervised learning, feature selection, and classification.
Rodrigo C. Barros, University of Sao Paulo (USP) at Sao Carlos
Rodrigo Coelho Barros received the B.Sc. degree from Universidade Federal de Pelotas, Pelotas, Brazil, and the M.Sc. degree from Pontifícia Universidade Catolica do Rio Grande do Sul, Porto Alegre, Brazil, both in computer science, in 2007 and 2009, respectively. He is currently working toward the Ph.D. degree in computer science at the University of Sao Paulo, where he works on machine learning and data mining topics. He has published papers in peer-reviewed journals and conferences. His current research interests include machine learning, data mining, knowledge discovery, and biologically inspired computational intelligence algorithms.
Tiago S. da Silva, University of Sao Paulo (USP) at Sao Carlos
Tiago Silva da Silva received the B.Sc. degree in Systems Analysis from Universidade Catolica de Pelotas, Pelotas, Brazil, in 2005, and his M.Sc. and Ph.D. degrees in Computer Science from Pontifícia Universidade Catolica do Rio Grande do Sul, Porto Alegre, Brazil, in 2008 and 2012, respectively. He is currently an assistant professor at the School of Arts, Sciences and Humanities of the University of Sao Paulo, Brazil. His current research interests include empirical software engineering, machine learning, and machine learning applied to software engineering tasks.
Eduardo R. Hruschka, University of Sao Paulo (USP) at Sao Carlos
Eduardo Raul Hruschka received his B.Sc. degree in Civil Engineering from Federal University of Paraná, Brazil, in 1995, and his M.Sc. and Ph.D. degrees in Computational Systems from Federal University of Rio de Janeiro in 1998 and 2001, respectively. He is currently an assistant professor in the Department of Computer Sciences of the University of São Paulo, Brazil. He has authored or coauthored more than 50 research publications in reputable peer-reviewed journals, book chapters, and conference proceedings. Dr. Hruschka has been a reviewer for several journals, such as Information Sciences, IEEE TSMC, IEEE TKDE, IEEE TEC, IEEE TNN, Journal of Heuristics, Pattern Recognition Letters, Applied Soft Computing, ACM Transactions on Autonomous and Adaptive Systems, Computational Statistics & Data Analysis, and Bioinformatics.
Andre C. P. L. F. de Carvalho, University of Sao Paulo (USP) at Sao Carlos
Andre C. P. L. F. de Carvalho received the B.Sc. and M.Sc. degrees in computer science from the Universidade Federal de Pernambuco, Recife, Brazil. He received his Ph.D. degree in electronic engineering from the University of Kent, Canterbury, Kent, U.K. He is a Full Professor with the Department of Computer Science, Universidade de Sao Paulo, Sao Carlos, Brazil. He has published around 100 journal papers and 250 refereed conference papers. He has been involved in the organization of several conferences and journal special issues. His main research interests include machine learning, data mining, bioinformatics, evolutionary computation, bioinspired computing, and hybrid intelligent systems.
Published
2013-09-13
Section
SBBD Articles