A soft hierarchical algorithm for the clustering of multiple bioactive chemical compounds
Most clustering methods used for chemical structures, such as Ward's, Group Average, k-means and Jarvis-Patrick, are known as hard or crisp because they partition a dataset into strictly disjoint subsets, and are therefore not suitable for the clustering of chemical structures exhibitin...
| Main Authors: | Salim, Naomie; Shah, J. Z. |
|---|---|
| Format: | Book Section |
| Language: | English |
| Published: | Springer Berlin / Heidelberg 2007 |
| Subjects: | QA75 Electronic computers. Computer science |
| Online Access: | http://eprints.utm.my/9630/ http://eprints.utm.my/9630/1/NaomieSalim2007_ASoftHierarchicalAlgorithm.pdf |
| _version_ | 1848891898072137728 |
|---|---|
| author | Salim, Naomie; Shah, J. Z. |
| author_facet | Salim, Naomie; Shah, J. Z. |
| author_sort | Salim, Naomie |
| building | UTeM Institutional Repository |
| collection | Online Access |
| description | Most clustering methods used for chemical structures, such as Ward's, Group Average, k-means and Jarvis-Patrick, are known as hard or crisp because they partition a dataset into strictly disjoint subsets, and are therefore not suitable for clustering chemical structures that exhibit more than one activity. Although fuzzy clustering algorithms such as fuzzy c-means provide an inherent mechanism for clustering overlapping structures (objects), this potential, which comes from their fuzzy membership functions, has not been utilized effectively. In this work a fuzzy hierarchical algorithm is developed that benefits not only from the fuzzy clustering process but also from the multiple memberships that fuzzy clustering provides. The algorithm divides every cluster whose size exceeds a predetermined threshold into two subclusters based on the membership values of each structure. A structure is assigned to one subcluster if its membership value for that subcluster is very high, or to both subclusters if its membership values are very similar. The performance of the algorithm is evaluated on two benchmark datasets and a large dataset of compound structures derived from the MDL MDDR database. The results show significant improvement over a similar implementation of the hard c-means algorithm. |
| first_indexed | 2025-11-15T21:05:16Z |
| format | Book Section |
| id | utm-9630 |
| institution | Universiti Teknologi Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T21:05:16Z |
| publishDate | 2007 |
| publisher | Springer Berlin / Heidelberg |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | utm-9630 2017-09-03T10:00:53Z http://eprints.utm.my/9630/ A soft hierarchical algorithm for the clustering of multiple bioactive chemical compounds Salim, Naomie; Shah, J. Z. QA75 Electronic computers. Computer science Springer Berlin / Heidelberg 2007-05 Book Section PeerReviewed application/pdf en http://eprints.utm.my/9630/1/NaomieSalim2007_ASoftHierarchicalAlgorithm.pdf Salim, Naomie and Shah, J. Z. (2007) A soft hierarchical algorithm for the clustering of multiple bioactive chemical compounds. In: Bioinformatics Research and Development. Springer Berlin / Heidelberg, pp. 140-153. ISBN 978-3-540-71232-9 http://dx.doi.org/10.1007/978-3-540-71233-6_12 doi:10.1007/978-3-540-71233-6_12 |
| title | A soft hierarchical algorithm for the clustering of multiple bioactive chemical compounds |
| title_sort | soft hierarchical algorithm for the clustering of multiple bioactive chemical compounds |
| topic | QA75 Electronic computers. Computer science |
| url | http://eprints.utm.my/9630/ http://eprints.utm.my/9630/1/NaomieSalim2007_ASoftHierarchicalAlgorithm.pdf |
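
The splitting rule summarized in the description field of this record (bisect any cluster larger than a threshold, and let a structure join both subclusters when its memberships are close) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `fuzzy_two_means` and `soft_bisect`, the parameters `max_size`, `margin` and `max_depth`, the farthest-point initialization, the Euclidean distance on plain numeric vectors, and the stopping guard are all assumptions made for illustration. The record itself does not specify the descriptors, distance measure, or thresholds used.

```python
import math


def fuzzy_two_means(points, m=2.0, iters=50):
    """Plain fuzzy c-means with c=2; returns one (u0, u1) pair per point.

    Assumption: centers start at the first point and the point farthest
    from it, which keeps the sketch deterministic.
    """
    first = points[0]
    far = max(points, key=lambda p: math.dist(first, p))
    centers = [list(first), list(far)]
    memberships = []
    for _ in range(iters):
        memberships = []
        for p in points:
            # Clamp distances so a point sitting on a center does not divide by zero.
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            memberships.append((inv[0] / total, inv[1] / total))
        for i in range(2):
            weights = [u[i] ** m for u in memberships]
            wsum = sum(weights)
            centers[i] = [
                sum(w * p[k] for w, p in zip(weights, points)) / wsum
                for k in range(len(first))
            ]
    return memberships


def soft_bisect(points, indices=None, max_size=4, margin=0.2,
                depth=0, max_depth=8):
    """Recursively bisect clusters larger than max_size (hypothetical threshold).

    A point goes to both children when its two memberships differ by at
    most `margin` (an assumed stand-in for "very similar"); otherwise it
    goes to the child with the higher membership.
    """
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) <= max_size or depth >= max_depth:
        return [indices]
    u = fuzzy_two_means([points[i] for i in indices])
    left, right = [], []
    for idx, (u0, u1) in zip(indices, u):
        if abs(u0 - u1) <= margin:
            left.append(idx)
            right.append(idx)
        elif u0 > u1:
            left.append(idx)
        else:
            right.append(idx)
    # Guard (assumption): stop if a child did not shrink, to avoid endless recursion.
    if len(left) == len(indices) or len(right) == len(indices):
        return [indices]
    return (soft_bisect(points, left, max_size, margin, depth + 1, max_depth)
            + soft_bisect(points, right, max_size, margin, depth + 1, max_depth))


# Two tight groups plus one point halfway between them.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 9), (9, 10), (5, 5)]
print(soft_bisect(points, max_size=4, margin=0.2))
```

In this toy input the point `(5, 5)` lies midway between the two groups, so it should appear in both leaf clusters, which is the overlapping behaviour a hard c-means bisection cannot produce.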