Automatic Threshold Selections using Artificial Bee Colony in Duplicate Record Detection

Deepa K.*, Vivek C.**, Rajan S. Palanivel**
*Sri Ramakrishna Engineering College, Coimbatore, India
**Department of Electronics and Communication Engineering, M. Kumarasamy College of Engineering, Karur, India

Online published on 14 October, 2016.

Abstract

Most deduplication processes require a similarity function that decides whether two entries are duplicates by applying a threshold. Setting this threshold is an important issue, and it relies heavily on human intervention. The proposed work addresses two problems: first, it finds the optimal equation using a Genetic Algorithm (GA); next, it adopts an intelligence algorithm, Artificial Bee Colony (ABC), to obtain the optimal threshold, which detects duplicate records more accurately and reduces human intervention. Two datasets, Restaurant and CORA, are used to analyze the proposed algorithm.

Keywords

GA, ABC, Similarity metrics, Cosine Similarity, Levenshtein Distance.
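
The role of the threshold in the abstract can be illustrated with a minimal sketch. It assumes a normalized Levenshtein similarity (one of the metrics named in the keywords) and a hand-picked threshold of 0.8. The function names, the lowercasing step, and the threshold value are illustrative assumptions, not the authors' equation or the threshold their ABC search produces.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    """Flag a pair as duplicate when similarity reaches the threshold.

    The default of 0.8 is an arbitrary, hand-set value (an assumption).
    """
    return similarity(a.lower(), b.lower()) >= threshold
```

With these assumptions, `is_duplicate("Art's Deli", "Arts Deli")` flags the pair as a duplicate, while raising the threshold far enough would reject it. That sensitivity is why the choice of threshold matters and why the paper seeks to select it automatically rather than by hand.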