Year : 2020, Volume : 10, Issue : 1
First page : ( 8) Last page : ( 18)
Print ISSN : 2249-3212. Online ISSN : 2249-3220. Published online : 2020 May 12.
Article DOI : 10.5958/2249-3220.2020.00002.6

Comparative Study of Data Mining Classifiers with Different Features and Different Databases Domain

Manimannan G.1*, Priya R. Lakshmi2, Arumugam A.3, Poompavai A.4

1Assistant Professor, Department of Mathematics, TMG College of Arts and Science. Chennai, India

2Assistant Professor, Department of Statistics, Dr. Ambedkar Govt. Arts College, Chennai, India

3Associate Professor, Department of Statistics, Annamalai University, Chidhambaram, India

4Assistant Professor, Department of Statistics, Apollo Arts and Science College, Chennai, India

*Corresponding author email id: manimannang@gmail.com

Received:  22  ,  2020; Accepted:  19  ,  2020.

Abstract

This paper compares and cross-validates three classification methods in terms of precision, accuracy and the kappa statistic, computed and visualized on datasets collected from different domains. The study was implemented in the R programming environment, and the results indicate which classifier is the most robust. An accuracy-based comparison of the classifiers across the datasets is presented. From the confusion matrix, the sensitivity, specificity, accuracy, true positive rate and false positive rate of each classifier are calculated for all three datasets, and the kappa statistics are also compared. The aim of the work is to analyze the effectiveness of the most popular classification techniques. According to the experimental results, the Support Vector Machine model had the best performance, performing best on all the datasets used. The Naive Bayes classifier and Random Forest also performed well. The table of true positive and false positive rates shows a true positive rate above 80% and a false positive rate below 20% for all four datasets. The kappa statistic measures the agreement between predicted and actual classes beyond what would be expected by chance; the classifiers are compared on this statistic, and a higher value is considered better.
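
The measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch in Python, not the authors' R implementation; the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the paper's datasets.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Derive common evaluation measures from binary confusion-matrix counts.

    tp/fn/fp/tn are true positives, false negatives, false positives and
    true negatives. The function name and signature are illustrative only.
    """
    total = tp + fn + fp + tn

    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)      # also called the true positive rate
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    precision = tp / (tp + fp)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    # Chance agreement is computed from the row and column marginals.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((fp + tn) / total) * ((fn + tn) / total)
    p_expected = p_pos + p_neg
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": false_positive_rate,
        "precision": precision,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = confusion_metrics(tp=45, fn=5, fp=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the classifier agrees with the true labels on 85 of 100 cases, while chance alone would account for 50, so kappa credits only the agreement above that baseline. This is why kappa is reported alongside accuracy when comparing classifiers.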

Keywords

Random forest, Naive Bayes classifier, Support vector machine, Confusion matrix, Kappa statistics.
