Mining unordered distance-constrained embedded subtrees
Frequent subtree mining is an important problem in the area of association rule mining from semi-structured or tree structured documents, often found in many commercial, web and scientific domains. This paper presents the u3Razor algorithm, for mining unordered embedded subtrees where the distance o...
| Main Authors: | Hadzic, Fedja; Tan, Henry; Dillon, Tharam S. |
|---|---|
| Other Authors: | J-F. Boulicaut |
| Format: | Conference Paper |
| Published: | Springer 2008 |
| Online Access: | http://hdl.handle.net/20.500.11937/28805 |
| _version_ | 1848752634632077312 |
|---|---|
| author | Hadzic, Fedja Tan, Henry Dillon, Tharam S. |
| author2 | J-F. Boulicaut |
| author_facet | J-F. Boulicaut Hadzic, Fedja Tan, Henry Dillon, Tharam S. |
| author_sort | Hadzic, Fedja |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | Frequent subtree mining is an important problem in association rule mining from semi-structured or tree-structured documents, which are common in commercial, web and scientific domains. This paper presents the u3Razor algorithm for mining unordered embedded subtrees where the distance of nodes relative to the root of the subtree needs to be considered. Mining distance-constrained unordered embedded subtrees will have important applications in web information systems, conceptual model analysis and more sophisticated knowledge matching. An encoding strategy is presented to efficiently enumerate candidate unordered embedded subtrees, taking the distance of nodes relative to the root of the subtree into account. Both synthetic and real-world datasets were used for experimental evaluation and discussion. |
| first_indexed | 2025-11-14T08:11:44Z |
| format | Conference Paper |
| id | curtin-20.500.11937-28805 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T08:11:44Z |
| publishDate | 2008 |
| publisher | Springer |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-288052022-11-21T06:47:08Z Mining unordered distance-constrained embedded subtrees Hadzic, Fedja Tan, Henry Dillon, Tharam S. J-F. Boulicaut M. R. Berthold T. Horváth Frequent subtree mining is an important problem in the area of association rule mining from semi-structured or tree structured documents, often found in many commercial, web and scientific domains. This paper presents the u3Razor algorithm, for mining unordered embedded subtrees where the distance of nodes relative to the root of the subtree needs to be considered. Mining distance-constrained unordered embedded subtrees will have important applications in web information systems, conceptual model analysis and more sophisticated knowledge matching. An encoding strategy is presented to efficiently enumerate candidate unordered embedded subtrees taking the distance of nodes relative to the root of the subtree into account. Both synthetic and real-world datasets were used for experimental evaluation and discussion. 2008 Conference Paper http://hdl.handle.net/20.500.11937/28805 10.1007/978-3-540-88411-8_26 Springer fulltext |
| spellingShingle | Hadzic, Fedja Tan, Henry Dillon, Tharam S. Mining unordered distance-constrained embedded subtrees |
| title | Mining unordered distance-constrained embedded subtrees |
| title_sort | mining unordered distance-constrained embedded subtrees |
| url | http://hdl.handle.net/20.500.11937/28805 |