Ensemble learning for handling imbalanced datasets with the combination of bagging and sampling methods

Rout Neelam1,*, Kuhoo2, Mishra Debahuti3, Mallick Manas Kumar3

1Research Scholar, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
2Student, Department of Mechanical Engineering, College of Engineering and Technology, Bhubaneswar, Odisha, India
3Professor, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India

*Corresponding Author: Neelam Rout, Research Scholar, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India. neelamrout@soa.ac.in
Published online on 16 October, 2018.

Abstract

Datasets with imbalanced class distributions are a major problem for classifiers in data mining. The problem occurs when the number of instances of the class or classes of interest is much lower than that of the other class or classes, and it hampers many real-world applications. In this study, the authors use the Bagging ensemble strategy combined with sampling methods to deal with imbalanced data. These combined strategies are explored and compared with other similar state-of-the-art methods; the results are analyzed, and statistical tests are performed to identify the best ensemble method. According to the Wilcoxon test, Over Bagging is the best-performing method among those compared.

Keywords: Imbalanced Learning, Ensemble Method, Performance Measure, Statistical Tests.
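
The combination of Bagging with a sampling method named in the abstract can be illustrated with a minimal sketch of Over Bagging as it is commonly described: each bag is a bootstrap sample in which every class is resampled with replacement up to the size of the largest class, and the ensemble predicts by majority vote. This is not the authors' implementation. The function names (`oversample_bootstrap`, `over_bagging_fit`, `over_bagging_predict`), the toy `NearestCentroid` base learner, and the parameter defaults are all assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def oversample_bootstrap(X, y, rng):
    """Draw one class-balanced bag: resample every class with replacement
    until it matches the size of the largest class."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for _ in range(target):
            Xb.append(rng.choice(rows))
            yb.append(label)
    return Xb, yb


class NearestCentroid:
    """Toy base learner (an assumption for illustration, not from the paper):
    predicts the class whose mean feature vector is closest."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        self.centroids = {
            c: [s / counts[c] for s in acc] for c, acc in sums.items()
        }
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


def over_bagging_fit(X, y, n_estimators=10, seed=0):
    """Train one base learner per oversampled bag."""
    rng = random.Random(seed)
    return [
        NearestCentroid().fit(*oversample_bootstrap(X, y, rng))
        for _ in range(n_estimators)
    ]


def over_bagging_predict(models, x):
    """Majority vote over the base learners."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]


# Toy imbalanced data: six majority-class points, two minority-class points.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8], [5, 5], [5, 6]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
models = over_bagging_fit(X, y, n_estimators=5)
minority_prediction = over_bagging_predict(models, [5.5, 5.5])
majority_prediction = over_bagging_predict(models, [0.3, 0.3])
```

In practice the base learner would typically be a stronger classifier such as a decision tree, and the ensemble would be evaluated with measures suited to imbalanced data rather than plain accuracy.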