Methods for handling missing data in serially sampled sputum specimens for mycobacterial culture conversion calculation

Background: The occurrence and timing of mycobacterial culture conversion are used as a proxy for tuberculosis treatment response. When researchers serially sample sputum during tuberculosis studies, contamination or missed visits lead to missing data points. Traditionally, this is managed by ignori...

Full description

Bibliographic Details
Main Authors: Malatesta, S., Weir, I.R., Weber, S.E., Bouton, T.C., Carney, T., Theron, D., Myers-Franchi, Bronwyn, Horsburgh, C.R., Warren, R.M., Jacobson, K.R., White, L.F.
Format: Journal Article
Language: English
Published: 2022
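Subjects: Culture conversion; Longitudinal missing data; Multiple imputation; Survival analysis; Tuberculosis; Humans; Longitudinal Studies; Data Interpretation, Statistical; Sputum; Bias; Research Design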
Online Access: http://hdl.handle.net/20.500.11937/89766
building Curtin Institutional Repository
collection Online Access
description Background: The occurrence and timing of mycobacterial culture conversion are used as a proxy for tuberculosis treatment response. When researchers serially sample sputum during tuberculosis studies, contamination or missed visits lead to missing data points. Traditionally, this is managed by ignoring missing data or by simple carry-forward techniques. More statistically advanced multiple imputation methods can potentially decrease bias while retaining sample size and statistical power.
Methods: We analyzed data from 261 participants who provided weekly sputa for the first 12 weeks of tuberculosis treatment. We compared methods for handling missing data points in a longitudinal study with a time-to-event outcome. Our primary outcome was time to culture conversion, defined as two consecutive weeks with no Mycobacterium tuberculosis growth. Methods used to address missing data included: 1) available case analysis, 2) last observation carried forward, and 3) multiple imputation by fully conditional specification. For each method, we calculated the proportion of participants who culture converted and used survival analysis to estimate Kaplan-Meier curves, hazard ratios, and restricted mean survival times. We compared methods based on point estimates, confidence intervals, and the conclusions drawn for specific research questions.
Results: The three missing data methods led to differences in the number of participants achieving conversion: 78 (32.8%) participants converted with available case analysis, 154 (64.7%) converted with last observation carried forward, and 184 (77.1%) converted with multiple imputation. Multiple imputation resulted in smaller point estimates and narrower confidence intervals than the simple approaches. The adjusted hazard ratio for smear-negative participants was 3.4 (95% CI 2.3, 5.1) using multiple imputation, compared to 5.2 (95% CI 3.1, 8.7) using last observation carried forward and 5.0 (95% CI 2.4, 10.6) using available case analysis.
Conclusion: We showed that accounting for missing sputum data through multiple imputation, a statistically valid approach under certain conditions, can lead to different conclusions than naïve methods. How to handle missing data must be carefully considered and pre-specified before analysis. We used data from a TB study to demonstrate these concepts; however, the methods we described are broadly applicable to longitudinal missing data. We provide statistical guidance and code to help researchers handle missing data appropriately in longitudinal studies.
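The outcome definition in the abstract (two consecutive weeks with no growth) and the two simple missing-data approaches can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the 1/0/None coding of weekly results, the choice to report the first of the two negative weeks as the conversion week, and the rule that a missing week breaks a run of negatives under available case analysis are all assumptions made for illustration. Multiple imputation by fully conditional specification is not shown.

```python
from typing import List, Optional, Sequence

# Hypothetical coding: 1 = M. tuberculosis growth, 0 = no growth, None = missing
Result = Optional[int]


def locf(results: Sequence[Result]) -> List[Result]:
    """Last observation carried forward: fill each missing week with the
    most recent observed result. Leading missing weeks stay missing."""
    filled: List[Result] = []
    last: Result = None
    for r in results:
        if r is not None:
            last = r
        filled.append(last)
    return filled


def conversion_week(results: Sequence[Result]) -> Optional[int]:
    """Return the 1-based week that starts the first run of two consecutive
    negative cultures, or None if no such run exists. A missing week
    (None) is never counted as negative (assumed available-case rule)."""
    for i in range(len(results) - 1):
        if results[i] == 0 and results[i + 1] == 0:
            return i + 1
    return None


# Made-up 12-week series for one participant, with several missing weeks
weekly = [1, 1, None, 0, None, 0, 0, None, None, 0, 0, 0]

available_case = conversion_week(weekly)      # missing weeks left as missing
locf_week = conversion_week(locf(weekly))     # missing weeks filled by LOCF

print("available case:", available_case)
print("LOCF:", locf_week)
```

In this made-up series, the week 5 gap delays conversion under the assumed available-case rule, while carrying the week 4 negative forward moves it earlier. This shows how the choice of method alone can shift time-to-event estimates for the same participant.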
first_indexed 2025-11-14T11:32:46Z
format Journal Article
id curtin-20.500.11937-89766
institution Curtin University Malaysia
institution_category Local University
language eng
last_indexed 2025-11-14T11:32:46Z
publishDate 2022
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-89766 2023-01-31T01:38:57Z doi:10.1186/s12874-022-01782-8 http://creativecommons.org/licenses/by/4.0/ fulltext
title Methods for handling missing data in serially sampled sputum specimens for mycobacterial culture conversion calculation
topic Culture conversion
Longitudinal missing data
Multiple imputation
Survival analysis
Tuberculosis
Humans
Longitudinal Studies
Data Interpretation, Statistical
Sputum
Bias
Research Design
url http://hdl.handle.net/20.500.11937/89766