A semantic crawler based on an extended CBR algorithm

Bibliographic Details
Main Authors: Dong, Hai; Hussain, Farookh Khadeer; Chang, Elizabeth
Other Authors: R. Meersman
Format: Conference Paper
Published: Springer 2008
Online Access: http://hdl.handle.net/20.500.11937/36321
Description
Summary: Semantic (web) crawlers are a class of web crawlers designed for harvesting semantic web content. This paper presents the framework of a semantic crawler that extracts metadata from online webpages and clusters the metadata by associating them with ontological concepts. The clustering is based on a case-based reasoning (CBR) algorithm adopted from the field of problem solving. We describe the formats of the ontological concepts and the metadata, as well as the extended CBR algorithm. We also report the system implementation and its evaluation, and close with our conclusions and future work.