
International Journal Of Data Mining And Emerging Technologies
Year : 2011, Volume : 1, Issue : 2
First page : (54) Last page : (60)
Print ISSN : 2249-3212. Online ISSN : 2249-3220.
Article DOI : 10.5958/j.2249-3212.1.2.2

Cancer Classification from Microarray Data using Gene Feature Ranking

Hasan Abid1,* (Lecturer), Maruf Golam Morshed2 (Lecturer), MD Shareef3 (BSc Student), Mamun Hawlader Abdullah Al4 (PhD Student), Kawn Paul4 (Senior Lecturer)

1Department of CIT, Islamic University of Technology, Bangladesh.

2Department of CSE, United International University, Bangladesh.

3Islamic University of Technology, Bangladesh.

4School of Science and Technology, University of New England, Armidale, NSW 2351, Australia.

*E-mail id: abid4en@gmail.com

Published online on 6 February 2012.

Abstract

A significant challenge in DNA (deoxyribonucleic acid) microarray analysis is that datasets contain a large number of features (genes) but only a small number of samples. When statistical methods are applied to microarray data, particular care is required to deal with problems such as the low classification accuracy of models, which is brought about by the small number of features that have predictive capability. To overcome these problems, proper approaches to data normalisation, feature reduction, and identification of the optimal set of genes are critical. In this paper, we apply the Gene Feature Ranking (GFR) [5] method to select genes with high trust values from high-dimensional cancer microarray datasets. Our contribution lies in the use of a different metric for calculating the trust values, one that is more domain specific for cancer datasets. By choosing a pre-defined threshold based on the user's knowledge, only genes that show sufficient trustworthiness are retained for constructing the classification model. Through experimentation on three microarray datasets, namely Acute Lymphoblastic Leukemia (ALL), lymph node negative primary breast cancer, and High Grade Glioma, we confirm that the classification accuracy obtained with the genes selected by the modified GFR method is consistently higher than that obtained without the method.
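The threshold-based retention step described above can be illustrated with a minimal sketch. The abstract does not define the trust metric, so the score below is only a stand-in (a simple signal-to-noise ratio between two classes), not the GFR trust value or the authors' modified metric. The function names, the toy expression values, and the threshold of 1.0 are all hypothetical.

```python
from statistics import mean, pstdev


def signal_to_noise(values, labels):
    """Stand-in relevance score for one gene (NOT the GFR trust value).

    Computes |mean_0 - mean_1| / (sd_0 + sd_1) over two classes labelled 0 and 1.
    """
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    gap = abs(mean(class0) - mean(class1))
    spread = pstdev(class0) + pstdev(class1)
    if spread == 0:
        # No within-class variation: the score is 0 if the means coincide,
        # otherwise the classes are perfectly separated.
        return 0.0 if gap == 0 else float("inf")
    return gap / spread


def select_genes(expression, labels, threshold):
    """Keep genes whose score reaches the user-chosen threshold, best first.

    expression maps a gene name to its list of per-sample expression values.
    """
    scores = {gene: signal_to_noise(vals, labels) for gene, vals in expression.items()}
    kept = [(gene, score) for gene, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: three genes measured on six samples (three per class).
expression = {
    "gene_a": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # clearly separates the classes
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # same mean in both classes
    "gene_c": [0.5, 3.0, 1.5, 1.0, 2.5, 2.0],  # noisy, weak separation
}
labels = [0, 0, 0, 1, 1, 1]

selected = select_genes(expression, labels, threshold=1.0)
for gene, score in selected:
    print(f"{gene}: {score:.2f}")
```

In this toy run only the gene that cleanly separates the two classes passes the threshold. A classifier would then be trained on the retained genes only, which is the comparison the abstract reports against using no selection.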

Keywords

Microarray data, Gene Feature Ranking, Acute Lymphoblastic Leukemia, Breast Cancer, High Grade Glioma.
