Review of methods for handling confounding by cluster and informative cluster size in clustered data

Clustered data are common in medical research. Typically, one is interested in a regression model for the association between an outcome and covariates. Two complications that can arise when analysing clustered data are informative cluster size (ICS) and confounding by cluster (CBC). ICS and CBC mea...

Bibliographic Details
Main Authors: Seaman, Shaun; Pavlou, Menelaos; Copas, Andrew
Format: Online
Language: English
Published: Blackwell Publishing Ltd 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4320764/
id pubmed-4320764
recordtype oai_dc
spelling pubmed-4320764 2015-02-13 Special Issue Papers Blackwell Publishing Ltd 2014-12-30 2014-08-04 /pmc/articles/PMC4320764/ /pubmed/25087978 http://dx.doi.org/10.1002/sim.6277 Text en © The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
http://creativecommons.org/licenses/by/3.0/ This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
author Seaman, Shaun
Pavlou, Menelaos
Copas, Andrew
title Review of methods for handling confounding by cluster and informative cluster size in clustered data
description Clustered data are common in medical research. Typically, one is interested in a regression model for the association between an outcome and covariates. Two complications that can arise when analysing clustered data are informative cluster size (ICS) and confounding by cluster (CBC). ICS and CBC mean that the outcome of a member given its covariates is associated with, respectively, the number of members in the cluster and the covariate values of other members in the cluster. Standard generalised linear mixed models for cluster-specific inference and standard generalised estimating equations for population-average inference assume, in general, the absence of ICS and CBC. Modifications of these approaches have been proposed to account for CBC or ICS. This article is a review of these methods. We express their assumptions in a common format, thus providing greater clarity about the assumptions that methods proposed for handling CBC make about ICS and vice versa, and about when different methods can be used in practice. We report relative efficiencies of methods where available, describe how methods are related, identify a previously unreported equivalence between two key methods, and propose some simple additional methods. Unnecessarily using a method that allows for ICS/CBC has an efficiency cost when ICS and CBC are absent. We review tools for identifying ICS/CBC. A strategy for analysis when CBC and ICS are suspected is demonstrated by examining the association between socio-economic deprivation and preterm neonatal death in Scotland.
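The abstract defines informative cluster size (ICS) as an association between a member's outcome and the number of members in its cluster. The following is a minimal sketch of why that matters, not a method taken from this record: the toy data, the function names `member_mean` and `cluster_weighted_mean`, and the choice of inverse-cluster-size weighting as the adjustment are all assumptions made for illustration. It shows that, when large clusters have systematically different outcomes, averaging over members and averaging over clusters answer different questions.

```python
from collections import defaultdict

# Hypothetical toy data: (cluster_id, binary outcome) for each member.
# Two clusters of size 1 with no events, one cluster of size 4 with
# three events, so outcome risk rises with cluster size (ICS present).
records = [
    ("A", 0),
    ("B", 0),
    ("C", 1), ("C", 1), ("C", 1), ("C", 0),
]


def member_mean(records):
    """Unweighted mean: every member counts equally, so large
    clusters contribute more to the estimate."""
    return sum(y for _, y in records) / len(records)


def cluster_weighted_mean(records):
    """Mean with each member weighted by 1 / (size of its cluster),
    so every cluster contributes equally regardless of its size."""
    sizes = defaultdict(int)
    for cluster_id, _ in records:
        sizes[cluster_id] += 1
    # Within each cluster the weights sum to 1, so the total weight
    # equals the number of clusters.
    weighted_sum = sum(y / sizes[cluster_id] for cluster_id, y in records)
    return weighted_sum / len(sizes)


if __name__ == "__main__":
    print("mean over all members:   ", member_mean(records))
    print("mean over typical cluster:", cluster_weighted_mean(records))
```

In this toy data the two averages differ (0.5 versus 0.25) only because outcome and cluster size are associated; if every cluster had the same event proportion, the two functions would return the same value.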