
Advances in Applied Research
Year : 2009, Volume : 1, Issue : 1
First page : (83) Last page : (92)
Print ISSN : 0974-3839. Online ISSN : 2349-2104.

Lemmatization and Visualization of Tamil Documents

Prabavathi G.T.

Lecturer (SS) in Computer Science, Department of Computer Science, Gobi Arts & Science College, Gobichettipalayam – 638453. Email: gtpraba@gmail.com

Online published on 11 June, 2014.

Abstract

Powerful methods for interactive exploration and search of textual document collections are essential for managing the ever-increasing flood of digital information. This paper deals with lemmatizing Tamil text documents and visualizing the clustered documents to support faster retrieval. Tamil, a language belonging to the south-central branch of the Dravidian languages, is highly inflectional and therefore requires extensive lemmatization to extract the correct root word. The mined documents are automatically clustered onto a map in an unsupervised manner, using statistical information about word contexts and a self-organizing map (SOM), which increases search efficiency in a Tamil digital library collection.
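
The clustering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the documents have already been lemmatized into token lists, uses plain term counts in place of the word-context statistics the abstract mentions, and uses English placeholder tokens as stand-ins for Tamil lemmas. The function names `bag_of_words`, `train_som`, and `best_matching_unit` are hypothetical.

```python
import math
import random


def bag_of_words(docs):
    """Turn lists of (already lemmatized) tokens into term-count vectors."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = [[doc.count(term) for term in vocab] for doc in docs]
    return vocab, vectors


def best_matching_unit(weights, vector):
    """Return the grid position whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda unit: sum((w - x) ** 2 for w, x in zip(weights[unit], vector)),
    )


def train_som(vectors, rows=2, cols=2, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; all parameter values are illustrative."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One randomly initialised weight vector per map unit.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighbourhood radius shrinks
        for vector in vectors:
            bmu = best_matching_unit(weights, vector)
            for unit, w in weights.items():
                grid_dist2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist2 / (2.0 * sigma * sigma))
                # Pull the unit's weights toward the document vector.
                for i in range(dim):
                    w[i] += lr * influence * (vector[i] - w[i])
    return weights


# Placeholder tokens standing in for lemmatized Tamil words (assumption).
docs = [["book", "library"], ["book", "library"], ["river", "rain"]]
vocab, vectors = bag_of_words(docs)
som = train_som(vectors)
positions = [best_matching_unit(som, v) for v in vectors]
```

Each entry in `positions` is the map cell assigned to a document, so documents with similar vocabulary tend to land on the same or neighbouring cells. A retrieval interface could then browse the map by cell instead of scanning the whole collection.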


Keywords

Text mining, Stemming, Lemmatization, Self-organizing maps.


 