International Journal of Data Mining And Emerging Technologies
Year : 2019, Volume : 9, Issue : 2
First page : (33) Last page : (42)
Print ISSN : 2249-3212. Online ISSN : 2249-3220.
Article DOI : 10.5958/2249-3220.2019.00005.3

Model Evaluation and Classification of Tamil Articles using WEKA Data Mining Techniques

Manimannan G.1,*, Priya R. Lakshmi2

1Assistant Professor, Department of Mathematics, TMG College of Arts and Science, Chennai, India

2Assistant Professor, Department of Statistics, Dr. Ambedkar Govt. Arts College, Chennai, India

*Corresponding author email id: manimannang@gmail.com

Online published on 21 May, 2020.

Abstract

This paper addresses the classification and cross-validation of articles by three contemporary Tamil scholars of the same period, namely Mahakavi Bharathiar (MB), Subramaniya Iyer (SI), and T.V. Kalyanasundaranar (TVK), using data mining techniques. These three popular scholars wrote a number of articles on India's Freedom Movement during the pre-independence period, published in the magazine India. The paper discusses the assignment of articles to MB, SI, and TVK. Four machine learning methods, Naïve Bayes, Multilayer Perceptron, Support Vector Machine, and Random Forest, are applied as data mining tools to build classification models and cross-validate them on the present dataset. For each author's classification model, the Rate of True Positive (RTP), Rate of False Positive (RFP), Precision, F-Measure, Matthews Correlation Coefficient (MCC), Receiver Operating Characteristic (ROC) Area, and Precision-Recall Curve (PRC) were extracted per class, and the reported values are close to unity. Different sets of stylistic features based on function words also clearly separate the styles of the three authors. The results indicate that machine learning data mining tools are feasible for analyzing large authorship databases. Finally, the three authors' writing models and stylistic features are classified and visualized with a bar chart for each parameter.
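The per-class measures named in the abstract can be illustrated with a short sketch. The code below is not the authors' WEKA workflow; it is a plain-Python approximation of how one-vs-rest TP rate, FP rate, precision, F-measure, and MCC are conventionally derived from a three-class confusion matrix. The function name `per_class_metrics` and the counts in `example` are hypothetical and are not data from the paper.

```python
import math


def per_class_metrics(confusion, labels):
    """Compute one-vs-rest metrics for each class.

    confusion[i][j] is the number of items whose true class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # true k, predicted other
        fp = sum(row[k] for row in confusion) - tp  # true other, predicted k
        tn = total - tp - fn - fp
        tp_rate = tp / (tp + fn) if tp + fn else 0.0  # also called recall
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f_measure = (2 * precision * tp_rate / (precision + tp_rate)
                     if precision + tp_rate else 0.0)
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        mcc = (tp * tn - fp * fn) / denom if denom else 0.0
        results[label] = {
            "tp_rate": tp_rate,
            "fp_rate": fp_rate,
            "precision": precision,
            "f_measure": f_measure,
            "mcc": mcc,
        }
    return results


# Made-up counts for illustration only; NOT results from the paper.
example = [
    [9, 1, 0],   # true MB
    [1, 8, 1],   # true SI
    [0, 0, 10],  # true TVK
]
metrics = per_class_metrics(example, ["MB", "SI", "TVK"])
for label, values in metrics.items():
    print(label, {name: round(v, 3) for name, v in values.items()})
```

Under these standard definitions, a well-performing classifier has TP rate, precision, F-measure, and MCC near 1, while its FP rate is near 0.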


Keywords

Authorship, Stylistic features, Data mining, Naïve Bayes, Classification trees, Multilayer perceptron, Random forest, Scatter plot.


  