A note on utilising binary features as ligand descriptors

It is common in cheminformatics to represent the properties of a ligand as a string of 1’s and 0’s, with the intention of elucidating, inter alia, the relationship between the chemical structure of a ligand and its bioactivity. In this commentary we note that, where relevant but non-redundant featur...

Bibliographic Details
Main Authors: Mussa, Hamse Y.; Mitchell, John B. O.; Glen, Robert C.
Format: Online
Language: English
Published: Springer International Publishing 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4665894/
Record: pubmed-4665894, 2015-12-02
Article Type: Commentary
Publication Date: 2015-12-01
PMC: /pmc/articles/PMC4665894/
PubMed: /pubmed/26628925
DOI: http://dx.doi.org/10.1186/s13321-015-0105-3
License: © Mussa et al. 2015. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
description It is common in cheminformatics to represent the properties of a ligand as a string of 1’s and 0’s, with the intention of elucidating, inter alia, the relationship between the chemical structure of a ligand and its bioactivity. In this commentary we note that, where relevant but non-redundant features are binary, they inevitably lead to a classifier capable of capturing only a linear relationship between structural features and activity. If, instead, we were to use relevant but non-redundant real-valued features, the resulting predictive model would be capable of describing a non-linear structure-activity relationship. Hence, we suggest that real-valued features, where available, are to be preferred in this scenario.