An Overview of Web Data Extraction Techniques

Devika K*, Surendran Subu**

Department of Computer Science and Engineering, SCT College of Engineering, Trivandrum, Kerala

*k_devu@yahoo.co.in
**subusurendran@gmail.com
Online published on 4 November, 2017.

Abstract
Web pages are usually generated for visualization, not for data exchange. Each page may contain several groups of structured data. Web pages are generated by plugging data values into predefined templates. Manual data extraction from semi-structured web pages is a difficult task. This paper surveys various automatic web data extraction techniques. There are two main types of technique: one is based on wrapper induction, the other on automatic extraction. Wrapper induction uses a set of extraction rules that are learnt from multiple pages containing similar data records.

Keywords
Data extraction, wrapper induction, DOM tree, web crawler, data alignment, pattern mining.
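
The idea of learning extraction rules from several pages that share a template can be illustrated with a minimal sketch. It assumes a simple left/right delimiter rule, which is one common style of wrapper and not necessarily the method surveyed in the paper. The function names (`learn_rule`, `extract`) and the sample pages are hypothetical and chosen only for illustration.

```python
import os


def common_suffix(strings):
    """Return the longest suffix shared by all strings."""
    reversed_prefix = os.path.commonprefix([s[::-1] for s in strings])
    return reversed_prefix[::-1]


def learn_rule(examples):
    """Learn a (left, right) delimiter pair from (page, value) examples.

    Assumption: each labelled value occurs in its page, and its first
    occurrence is the one that was labelled.
    """
    lefts, rights = [], []
    for page, value in examples:
        start = page.index(value)
        lefts.append(page[:start])                 # text before the value
        rights.append(page[start + len(value):])   # text after the value
    # The rule keeps only the context that every example agrees on.
    return common_suffix(lefts), os.path.commonprefix(rights)


def extract(page, rule):
    """Apply a learnt (left, right) rule to a new page."""
    left, right = rule
    start = page.index(left) + len(left)
    end = page.index(right, start)
    return page[start:end]


# Hypothetical pages produced from one template with different data values.
examples = [
    ("<p>Book 1</p><b>Dune</b><i>1965</i>", "Dune"),
    ("<p>Book 22</p><b>The Martian</b><i>2011</i>", "The Martian"),
]
rule = learn_rule(examples)
title = extract("<p>Book 7</p><b>Emma</b><i>1815</i>", rule)
```

Because the rule is the context shared by all labelled examples, the template markup survives while the varying data values drop out. Real wrappers usually work on tokens or DOM paths and handle several records per page, which this sketch does not attempt.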