International Journal of Scientific Engineering and Technology
Year : 2013, Volume : 2, Issue : 4
First page : ( 278) Last page : ( 287)
Online ISSN : 2277-1581.

An Overview of Web Data Extraction Techniques

Devika K*, Surendran Subu**

Department of Computer Science and Engineering, SCT College of Engineering, Trivandrum, Kerala

*k_devu@yahoo.co.in

**subusurendran@gmail.com

Online published on 4 November, 2017.

Abstract

Web pages are usually generated for visualization, not for data exchange. Each page may contain several groups of structured data. Web pages are generated by plugging data values into predefined templates. Manually extracting data from such semi-structured web pages is a difficult task. This paper surveys automatic web data extraction techniques. There are two main types of technique: one based on wrapper induction, the other on automatic extraction. Wrapper induction uses a set of extraction rules that are learnt from multiple pages containing similar data records.
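To make the idea of learning extraction rules from several template-generated pages concrete, the following is a minimal sketch and not a method taken from this paper. It assumes a simplified left/right delimiter rule: given a few pages with one labelled value each, the rule is the longest text shared immediately before and immediately after the labelled values. The template, the page contents, and the function names (learn_lr_rule, extract) are hypothetical and chosen only for illustration.

```python
import os

# Hypothetical page template; real pages would come from a web site.
TEMPLATE = "<html><h1>{title}</h1><span class='price'>{price}</span></html>"


def common_suffix(strings):
    """Longest string that every input ends with."""
    reversed_strings = [s[::-1] for s in strings]
    return os.path.commonprefix(reversed_strings)[::-1]


def learn_lr_rule(pages, values):
    """Learn a (left, right) delimiter pair from labelled example pages.

    Each page must contain its labelled value; the first occurrence is used.
    """
    lefts, rights = [], []
    for page, value in zip(pages, values):
        start = page.index(value)
        lefts.append(page[:start])
        rights.append(page[start + len(value):])
    # Text shared right before and right after every labelled value.
    return common_suffix(lefts), os.path.commonprefix(rights)


def extract(page, rule):
    """Apply a learned rule to a page; return None if a delimiter is missing."""
    left, right = rule
    start = page.find(left)
    if start == -1:
        return None
    start += len(left)
    end = page.find(right, start)
    if end == -1:
        return None
    return page[start:end]


# Two labelled training pages generated from the same template.
pages = [
    TEMPLATE.format(title="Lamp", price="12.50"),
    TEMPLATE.format(title="Desk", price="89.00"),
]
values = ["12.50", "89.00"]

rule = learn_lr_rule(pages, values)
new_page = TEMPLATE.format(title="Chair", price="45.99")
print(extract(new_page, rule))  # 45.99
```

The learned rule only works while new pages follow the same template as the training pages. This sensitivity to template changes is a known limitation of delimiter-based wrappers.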

Keywords

Data extraction, wrapper induction, DOM tree, web crawler, data alignment, pattern mining.
