State of the art in metadata abstraction crawlers

Bibliographic Details
Main Authors: Dong, Hai; Hussain, Farookh Khadeer; Chang, Elizabeth
Other Authors: Hubert Wo
Format: Conference Paper
Published: Institute of Electrical and Electronics Engineers (IEEE) 2008
Subjects:
Online Access: http://hdl.handle.net/20.500.11937/48258
Description
Summary: Research on crawlers is moving closer to the semantic web, driven by the growing number of XML/RDF/OWL files and the rapid development of ontology mark-up languages. As an emerging concept, metadata abstraction crawlers are a family of crawlers that aim to abstract metadata from normal HTML documents using various semantic web technologies. In this paper, we present a general survey of the current state of metadata abstraction crawlers. Fourteen cases in this field are chosen as typical examples and classified into five clusters. We compare and contrast the semantic web crawlers in each cluster from seven perspectives, and draw our conclusions in the final section.