State of the art in metadata abstraction crawlers

Nowadays, the research of crawlers moves closer to the semantic web, along with the appearance of increasing XML/RDF/OWL files and the rapid development of ontology mark-up languages. As an emerging concept, metadata abstraction crawlers are a series of crawlers that aim to abstract metadata from no...

Bibliographic Details
Main Authors: Dong, Hai, Hussain, Farookh Khadeer, Chang, Elizabeth
Other Authors: Hubert Wo
Format: Conference Paper
Published: Institute of Electrical and Electronics Engineers (IEEE) 2008
Subjects: RDF crawlers; OAI-PMH; semantic web crawlers; Focused crawlers; metadata abstraction
Online Access: http://hdl.handle.net/20.500.11937/48258
_version_ 1848758059841617920
author Dong, Hai
Hussain, Farookh Khadeer
Chang, Elizabeth
author2 Hubert Wo
author_facet Hubert Wo
Dong, Hai
Hussain, Farookh Khadeer
Chang, Elizabeth
author_sort Dong, Hai
building Curtin Institutional Repository
collection Online Access
description Nowadays, the research of crawlers moves closer to the semantic web, along with the appearance of increasing XML/RDF/OWL files and the rapid development of ontology mark-up languages. As an emerging concept, metadata abstraction crawlers are a series of crawlers that aim to abstract metadata from normal HTML documents, based on various semantic web technologies. In this paper, we make a general survey of the current situation of metadata abstraction crawlers. Fourteen cases in this field are chosen as typical examples, and classified in five clusters. From seven perspectives we horizontally compare and contrast the semantic web crawlers in each cluster, and draw our conclusion in the final section.
first_indexed 2025-11-14T09:37:58Z
format Conference Paper
id curtin-20.500.11937-48258
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T09:37:58Z
publishDate 2008
publisher Institute of Electrical and Electronics Engineers (IEEE)
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-48258 2022-12-07T06:50:51Z State of the art in metadata abstraction crawlers Dong, Hai Hussain, Farookh Khadeer Chang, Elizabeth Hubert Wo Heping Xie RDF crawlers OAI-PMH semantic web crawlers Focused crawlers metadata abstraction Nowadays, the research of crawlers moves closer to the semantic web, along with the appearance of increasing XML/RDF/OWL files and the rapid development of ontology mark-up languages. As an emerging concept, metadata abstraction crawlers are a series of crawlers that aim to abstract metadata from normal HTML documents, based on various semantic web technologies. In this paper, we make a general survey of the current situation of metadata abstraction crawlers. Fourteen cases in this field are chosen as typical examples, and classified in five clusters. From seven perspectives we horizontally compare and contrast the semantic web crawlers in each cluster, and draw our conclusion in the final section. 2008 Conference Paper http://hdl.handle.net/20.500.11937/48258 10.1109/ICIT.2008.4608573 Institute of Electrical and Electronics Engineers (IEEE) fulltext
spellingShingle RDF crawlers
OAI-PMH
semantic web crawlers
Focused crawlers
metadata abstraction
Dong, Hai
Hussain, Farookh Khadeer
Chang, Elizabeth
State of the art in metadata abstraction crawlers
title State of the art in metadata abstraction crawlers
title_sort state of the art in metadata abstraction crawlers
topic RDF crawlers
OAI-PMH
semantic web crawlers
Focused crawlers
metadata abstraction
url http://hdl.handle.net/20.500.11937/48258