
Asian Journal of Research in Social Sciences and Humanities
Year : 2016, Volume : 6, Issue : 10
First page : (535) Last page : (546)
Online ISSN : 2249-7315.
Article DOI : 10.5958/2249-7315.2016.01032.7

Automatic Threshold Selections using Artificial Bee Colony in Duplicate Record Detection

Deepa K.*, Vivek C.**, Rajan S. Palanivel**

*Sri Ramakrishna Engineering College, Coimbatore, India

**Department of Electronics and Communication Engineering, M. Kumarasamy College of Engineering, Karur, India

Online published on 14 October, 2016.

Abstract

Most deduplication processes require a similarity function that decides whether two entries are duplicates by comparing their similarity score against a threshold. Setting this threshold is an important issue, and it usually relies on human intervention. The proposed work addresses two problems: first, it finds the optimal equation using a Genetic Algorithm (GA); next, it adopts a swarm intelligence algorithm, Artificial Bee Colony (ABC), to obtain the optimal threshold, which detects duplicate records more accurately and reduces human intervention. Two datasets, Restaurant and CORA, are used to analyze the proposed algorithm.
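To make the role of the threshold concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a normalized Levenshtein similarity as the similarity function and a hand-picked threshold; the function names, the sample restaurant names, and the value 0.8 are all hypothetical. In the proposed work the threshold would instead be selected by ABC.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_duplicate(a: str, b: str, threshold: float) -> bool:
    """Assumption: a pair is a duplicate when similarity >= threshold."""
    return similarity(a.lower(), b.lower()) >= threshold


# Hypothetical records and a hand-picked threshold, for illustration only.
print(is_duplicate("Art's Deli", "Arts Deli", threshold=0.8))
print(is_duplicate("Art's Deli", "Spago", threshold=0.8))
```

A threshold set too low merges distinct records, and one set too high misses true duplicates, which is why selecting it automatically rather than by hand is attractive.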


Keywords

GA, ABC, Similarity metrics, Cosine Similarity, Levenshtein Distance.


  