Missing Data in Surgical Data Sets: A Review of Pertinent Issues and Solutions

J Surg Res. 2018 Dec:232:240-246. doi: 10.1016/j.jss.2018.06.034. Epub 2018 Jul 13.

Abstract

Incomplete data is a common problem in research studies. Methods to address missing observations in a data set have been extensively researched and described, and disseminating these methods to the greater research community is an ongoing effort. In this article, we describe some of the basic principles of missing data and identify practical, commonly used methods of adjustment relevant to surgical data sets. Using an example data set, we compare models generated through complete case analysis, single imputation (SI), and multiple imputation (MI). We also provide information on the steps to conduct MI using Stata IC. In our comparisons, the largest differences in odds ratios were between the complete case analysis and the SI and MI models, indicating that, in this case, the reduction in statistical power has a non-negligible effect on the parameter estimates. Odds ratio estimates from the SI and MI methods were largely similar, although in some instances the SI method tended to overestimate effect sizes relative to the MI method. Although the odds ratios in this example do not differ greatly between the SI and MI methods, there are clear indications supporting the use of MI over SI. By describing the issues surrounding missing data and the available options for adjustment, we hope to encourage the use of robust imputation methods for missing observations.
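
As a rough illustration of one reason MI is generally preferred over SI, the sketch below pools estimates from several imputed data sets using Rubin's rules, the standard combining rules for MI. It is not the article's Stata procedure: the function name `pool_rubin` and all numbers are hypothetical, chosen only for illustration, and the confidence interval uses a normal approximation where a full analysis would use a t reference distribution.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one point estimate per imputed data set (e.g. a log odds ratio)
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # average of the m estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratios and squared standard errors from m = 5 imputations.
log_ors = [0.40, 0.50, 0.45, 0.55, 0.35]
ses_sq = [0.04, 0.04, 0.04, 0.04, 0.04]

pooled_log_or, total_var = pool_rubin(log_ors, ses_sq)

# Back-transform to the odds ratio scale (normal approximation, a simplification).
half_width = 1.96 * math.sqrt(total_var)
pooled_or = math.exp(pooled_log_or)
ci = (math.exp(pooled_log_or - half_width), math.exp(pooled_log_or + half_width))

print(f"pooled OR = {pooled_or:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The total variance is larger than the variance from any single imputed data set because it adds the between-imputation component. An analysis based on one imputed data set has no such component, so it treats imputed values as if they were observed and tends to understate uncertainty.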

Keywords: Complete case analysis; Missing data; Multiple imputation; Single imputation; Statistical methodology.

Publication types

  • Review

MeSH terms

  • Datasets as Topic*
  • Humans
  • Software
  • Surgical Procedures, Operative* / mortality
  • Vascular Surgical Procedures / mortality