Statistical tests for large tree-structured data
We develop a general statistical framework for the analysis and inference of large tree-structured data, with a focus on developing asymptotic goodness-of-fit tests. We first propose a consistent statistical model for binary trees, from which we develop a class of invariant tests. Using the model fo...
| Main Authors: | Bharath, Karthik; Kambadur, Prabhanjan; Dey, Dipak K.; Rao, Arvind; Baladandayuthapani, Veerabhadran |
|---|---|
| Format: | Article |
| Published: | Taylor & Francis, 2017 |
| Online Access: | https://eprints.nottingham.ac.uk/40800/ |
| _version_ | 1848796136562753536 |
|---|---|
| author | Bharath, Karthik; Kambadur, Prabhanjan; Dey, Dipak K.; Rao, Arvind; Baladandayuthapani, Veerabhadran |
| author_sort | Bharath, Karthik |
| building | Nottingham Research Data Repository |
| collection | Online Access |
| description | We develop a general statistical framework for the analysis and inference of large tree-structured data, with a focus on developing asymptotic goodness-of-fit tests. We first propose a consistent statistical model for binary trees, from which we develop a class of invariant tests. Using the model for binary trees, we then construct tests for general trees by using the distributional properties of the Continuum Random Tree, which arises as the invariant limit for a broad class of models for tree-structured data based on conditioned Galton–Watson processes. The test statistics for the goodness-of-fit tests are simple to compute and are asymptotically distributed as χ² and F random variables. We illustrate our methods on an important application of detecting tumor heterogeneity in brain cancer. We use a novel approach with tree-based representations of magnetic resonance images and employ the developed tests to ascertain tumor heterogeneity between two groups of patients. |
| first_indexed | 2025-11-14T19:43:11Z |
| format | Article |
| id | nottingham-40800 |
| institution | University of Nottingham Malaysia Campus |
| institution_category | Local University |
| last_indexed | 2025-11-14T19:43:11Z |
| publishDate | 2017 |
| publisher | Taylor & Francis |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | nottingham-40800 2020-05-04T18:59:39Z https://eprints.nottingham.ac.uk/40800/ Bharath, Karthik, Kambadur, Prabhanjan, Dey, Dipak K., Rao, Arvind and Baladandayuthapani, Veerabhadran (2017) Statistical tests for large tree-structured data. Journal of the American Statistical Association, 112 (520). pp. 1733-1743. ISSN 1537-274X. Taylor & Francis, 2017-08-07. Article, PeerReviewed. http://www.tandfonline.com/doi/full/10.1080/01621459.2016.1240081 doi:10.1080/01621459.2016.1240081 |
| title | Statistical tests for large tree-structured data |
| url | https://eprints.nottingham.ac.uk/40800/ |