Mining unordered distance-constrained embedded subtrees

Frequent subtree mining is an important problem in the area of association rule mining from semi-structured or tree structured documents, often found in many commercial, web and scientific domains. This paper presents the u3Razor algorithm, for mining unordered embedded subtrees where the distance o...

Bibliographic Details
Main Authors: Hadzic, Fedja; Tan, Henry; Dillon, Tharam S.
Other Authors: J-F. Boulicaut
Format: Conference Paper
Published: Springer 2008
Online Access: http://hdl.handle.net/20.500.11937/28805
_version_ 1848752634632077312
author Hadzic, Fedja
Tan, Henry
Dillon, Tharam S.
author2 J-F. Boulicaut
author_facet J-F. Boulicaut
Hadzic, Fedja
Tan, Henry
Dillon, Tharam S.
author_sort Hadzic, Fedja
building Curtin Institutional Repository
collection Online Access
description Frequent subtree mining is an important problem in the area of association rule mining from semi-structured or tree structured documents, often found in many commercial, web and scientific domains. This paper presents the u3Razor algorithm, for mining unordered embedded subtrees where the distance of nodes relative to the root of the subtree needs to be considered. Mining distance-constrained unordered embedded subtrees will have important applications in web information systems, conceptual model analysis and more sophisticated knowledge matching. An encoding strategy is presented to efficiently enumerate candidate unordered embedded subtrees taking the distance of nodes relative to the root of the subtree into account. Both synthetic and real-world datasets were used for experimental evaluation and discussion.
first_indexed 2025-11-14T08:11:44Z
format Conference Paper
id curtin-20.500.11937-28805
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T08:11:44Z
publishDate 2008
publisher Springer
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-28805 2022-11-21T06:47:08Z Mining unordered distance-constrained embedded subtrees Hadzic, Fedja Tan, Henry Dillon, Tharam S. J-F. Boulicaut M. R. Berthold T. Horváth Frequent subtree mining is an important problem in the area of association rule mining from semi-structured or tree structured documents, often found in many commercial, web and scientific domains. This paper presents the u3Razor algorithm, for mining unordered embedded subtrees where the distance of nodes relative to the root of the subtree needs to be considered. Mining distance-constrained unordered embedded subtrees will have important applications in web information systems, conceptual model analysis and more sophisticated knowledge matching. An encoding strategy is presented to efficiently enumerate candidate unordered embedded subtrees taking the distance of nodes relative to the root of the subtree into account. Both synthetic and real-world datasets were used for experimental evaluation and discussion. 2008 Conference Paper http://hdl.handle.net/20.500.11937/28805 10.1007/978-3-540-88411-8_26 Springer fulltext
spellingShingle Hadzic, Fedja
Tan, Henry
Dillon, Tharam S.
Mining unordered distance-constrained embedded subtrees
title Mining unordered distance-constrained embedded subtrees
url http://hdl.handle.net/20.500.11937/28805