Use of posterior predictive assessments to evaluate model fit in multilevel logistic regression

Assessing the fit of a model is an important final step in any statistical analysis, but this is not straightforward when complex discrete response models are used. Cross validation and posterior predictions have been suggested as methods to aid model criticism. In this paper a comparison is made be...

Bibliographic Details
Main Authors: Green, Martin J., Medley, Graham F., Browne, William J.
Format: Article
Published: EDP Sciences 2009
Subjects:
Online Access: https://eprints.nottingham.ac.uk/1273/
building Nottingham Research Data Repository
collection Online Access
description Assessing the fit of a model is an important final step in any statistical analysis, but this is not straightforward when complex discrete response models are used. Cross validation and posterior predictions have been suggested as methods to aid model criticism. In this paper a comparison is made between four methods of model predictive assessment in the context of a three-level logistic regression model for clinical mastitis in dairy cattle: cross validation, a prediction using the full posterior predictive distribution, and two “mixed” predictive methods that incorporate higher-level random effects simulated from the underlying model distribution. Cross validation is considered a gold standard method but is computationally intensive, so the posterior predictive assessments are compared against it. The analyses revealed that the mixed prediction methods produced results close to cross validation, whilst the full posterior predictive assessment gave predictions that were over-optimistic (closer to the observed disease rates) compared with cross validation. A mixed prediction method that simulated random effects from both higher levels was best at identifying the outlying level-two (farm-year) units of interest. It is concluded that this mixed prediction method, simulating random effects from both higher levels, is straightforward and may be of value in model criticism of multilevel logistic regression, a technique commonly used for animal health data with a hierarchical structure.
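The difference between the full and the "mixed" predictive methods described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `predictive_counts`, the method labels, the random-intercept-only linear predictor, and the synthetic "posterior draws" are all assumptions made for illustration, and the numbers are not the paper's data. The sketch assumes farm-years (level two) nested in farms (level three), with normally distributed random effects on the logit scale.

```python
import numpy as np

METHODS = ("full", "mixed_l2", "mixed_l2_l3")  # hypothetical labels


def predictive_counts(draws, farm_of_unit, n_cows, method, rng):
    """Replicate farm-year case counts once per posterior draw.

    draws: dict of arrays 'beta0' (S,), 'sigma_u' (S,), 'sigma_v' (S,),
           'u' (S, J) farm-year effects, 'v' (S, K) farm effects.
    """
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    u, v = draws["u"], draws["v"]
    if method != "full":
        # Mixed: discard the fitted farm-year effects and simulate new ones
        # from the model distribution N(0, sigma_u^2), draw by draw.
        u = rng.normal(0.0, draws["sigma_u"][:, None], size=u.shape)
    if method == "mixed_l2_l3":
        # Also simulate the farm-level effects from N(0, sigma_v^2).
        v = rng.normal(0.0, draws["sigma_v"][:, None], size=v.shape)
    eta = draws["beta0"][:, None] + v[:, farm_of_unit] + u  # logit scale
    p = 1.0 / (1.0 + np.exp(-eta))
    return rng.binomial(n_cows, p)  # shape (S, J)


# Synthetic stand-in for MCMC output (assumption, for illustration only).
rng = np.random.default_rng(0)
S, J, K = 2000, 12, 4                      # draws, farm-years, farms
farm_of_unit = np.repeat(np.arange(K), J // K)
n_cows = np.full(J, 100)
u_true = rng.normal(0.0, 1.0, size=J)
v_true = rng.normal(0.0, 0.5, size=K)
draws = {
    "beta0": rng.normal(-2.0, 0.05, size=S),
    "sigma_u": np.abs(rng.normal(1.0, 0.05, size=S)),
    "sigma_v": np.abs(rng.normal(0.5, 0.05, size=S)),
    "u": u_true + 0.1 * rng.normal(size=(S, J)),
    "v": v_true + 0.1 * rng.normal(size=(S, K)),
}

full = predictive_counts(draws, farm_of_unit, n_cows, "full", rng)
mixed = predictive_counts(draws, farm_of_unit, n_cows, "mixed_l2_l3", rng)
print("mean predictive SD, full :", full.std(axis=0).mean())
print("mean predictive SD, mixed:", mixed.std(axis=0).mean())
```

In this sketch the full method reuses the fitted effects for each farm-year, so its replicated counts stay close to what was observed, which is the over-optimism the abstract describes. The mixed methods replace those effects with fresh draws, so a farm-year whose observed count falls far out in the tail of its replicated counts is a candidate outlying unit.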
first_indexed 2025-11-14T18:14:47Z
format Article
id nottingham-1273
institution University of Nottingham Malaysia Campus
institution_category Local University
last_indexed 2025-11-14T18:14:47Z
publishDate 2009
publisher EDP Sciences
recordtype eprints
repository_type Digital Repository
citation Green, Martin J., Medley, Graham F. and Browne, William J. (2009) Use of posterior predictive assessments to evaluate model fit in multilevel logistic regression. Veterinary Research, 40 (4). Article 30. ISSN 0928-4249. Peer reviewed. doi:10.1051/vetres/2009013 (http://dx.doi.org/10.1051/vetres/2009013)
topic model fit
posterior predictive assessment
mixed predictive assessment
cross validation
Bayesian multilevel model
url https://eprints.nottingham.ac.uk/1273/