Missing Data in Surgical Data Sets: A Review of Pertinent Issues and Solutions

Sherene E Sharath; Nader Zamani; Panos Kougias; Soeun Kim

doi:10.1016/j.jss.2018.06.034

J Surg Res. 2018 Dec:232:240-246. doi: 10.1016/j.jss.2018.06.034. Epub 2018 Jul 13.

Authors

Sherene E Sharath¹, Nader Zamani¹, Panos Kougias¹, Soeun Kim²

Affiliations

¹ Division of Vascular Surgery and Endovascular Therapy, Michael E. DeBakey Department of Surgery, Baylor College of Medicine/Michael E. DeBakey Veterans Affairs Medical Center, Houston, Texas.
² Department of Biostatistics and Data Science, University of Texas Health Science Center - School of Public Health, Houston, Texas. Electronic address: Soeun.S.Kim@gmail.com.

PMID: 30463724

Abstract

Incomplete data is a common problem in research studies. Methods to address missing observations in a data set have been extensively researched and described, and disseminating these methods to the greater research community is an ongoing effort. In this article, we describe some of the basic principles of missing data and identify practical, commonly used methods of adjustment relevant to surgical data sets. Using an example data set, we compare models generated through complete case analysis, single imputation (SI), and multiple imputation (MI). We also describe the steps to conduct MI using Stata IC. In our comparisons, odds ratios from complete case analysis differed most from those of the SI and MI models, indicating that in this case the reduction in statistical power has a non-negligible effect on the parameter estimates. Odds ratio estimates from the SI and MI methods were largely similar, although in some instances the SI method tended to overestimate effect sizes relative to the MI method. Although the odds ratios in this example do not vary greatly between the SI and MI methods, there are clear indications supporting the use of MI over SI. By describing the issues surrounding missing data and the available options for adjustment, we hope to encourage the use of robust imputation methods for missing observations.
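One reason MI is generally preferred over SI is that MI carries the uncertainty of the imputed values into the final standard error, whereas a single imputed data set is analyzed as if it were fully observed. The sketch below illustrates this with the standard pooling step (Rubin's rules) in plain Python. It is not the Stata IC procedure described in the article: the function name `pool_rubin` and the log odds ratios and standard errors are hypothetical values chosen only to show the arithmetic.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one point estimate (e.g. a log odds ratio) per imputed data set
    variances: the squared standard error of each estimate
    Returns (pooled estimate, within variance, between variance, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return q_bar, within, between, total


# Hypothetical results from m = 5 imputed data sets (illustrative values only).
log_ors = [0.40, 0.50, 0.45, 0.55, 0.35]
std_errs = [0.20, 0.21, 0.19, 0.20, 0.20]

q_bar, within, between, total = pool_rubin(log_ors, [se ** 2 for se in std_errs])

pooled_or = math.exp(q_bar)    # back-transform to the odds ratio scale
pooled_se = math.sqrt(total)   # reflects imputation uncertainty
naive_se = math.sqrt(within)   # roughly what a single imputation would report

print(f"pooled OR = {pooled_or:.3f}")
print(f"pooled SE = {pooled_se:.3f}, SE ignoring between-imputation variance = {naive_se:.3f}")
```

Because the total variance adds a between-imputation term to the within-imputation variance, the pooled standard error is never smaller than the one obtained by ignoring that term, which is the sense in which a single imputation can overstate precision.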

Keywords: Complete case analysis; Missing data; Multiple imputation; Single imputation; Statistical methodology.

Publication types

Review

MeSH terms

Datasets as Topic*
Humans
Software
Surgical Procedures, Operative* / mortality
Vascular Surgical Procedures / mortality