Comparing adaptive and fixed bandwidth-based kernel density estimates in spatial cancer epidemiology

Int J Health Geogr. 2015 Mar 31:14:15. doi: 10.1186/s12942-015-0005-9.

Abstract

Background: Monitoring spatial disease risk (e.g. identifying risk areas) is of great relevance in public health research, especially in cancer epidemiology. A common strategy uses case-control studies and estimates a spatial relative risk function (sRRF) via kernel density estimation (KDE). This study was set up to evaluate sRRF estimation methods by comparing fixed and adaptive bandwidth-based KDE in their ability to detect 'risk areas', using case data from a population-based cancer registry.
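As a rough illustration of the sRRF idea described above, the following minimal sketch computes a log relative risk surface as the difference of the log case density and the log control density, each estimated with a fixed-bandwidth Gaussian kernel. The function names, the isotropic Gaussian kernel, and the shared bandwidth `h` are assumptions made for illustration; the abstract does not specify the study's software or kernel, and edge correction is omitted.

```python
import numpy as np

def gaussian_kde_fixed(points, grid, h):
    """Fixed-bandwidth 2-D Gaussian KDE evaluated at grid locations.

    points: (n, 2) array of event coordinates
    grid:   (m, 2) array of evaluation coordinates
    h:      scalar bandwidth (same units as the coordinates)
    """
    # Squared distance between every grid location and every data point
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    # Isotropic Gaussian kernel, averaged over the data points
    return np.exp(-d2 / (2 * h**2)).sum(axis=1) / (len(points) * 2 * np.pi * h**2)

def log_relative_risk(cases, controls, grid, h):
    """Log ratio of case density to control density (hypothetical helper)."""
    f = gaussian_kde_fixed(cases, grid, h)     # case density
    g = gaussian_kde_fixed(controls, grid, h)  # control density
    return np.log(f) - np.log(g)
```

Values above zero indicate locations where cases are relatively more concentrated than controls. Identical case and control patterns give a surface that is zero everywhere.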

Methods: The sRRFs were estimated within a defined area, using locational information on incident cancer cases and on a spatial sample of controls drawn from a high-resolution population grid known to underestimate the resident population in urban centers. The spatial extents of these areas with underestimated resident population were quantified with population reference data and used in this study as 'true risk areas'. Sensitivity and specificity analyses were conducted by spatial overlay of the 'true risk areas' and the significant (α=.05) p-contour lines obtained from the sRRF.
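The overlay-based sensitivity and specificity analysis can be sketched as follows, assuming both the 'true risk areas' and the areas enclosed by significant p-contours have been rasterized onto the same grid as boolean masks. The rasterization and the function name are assumptions; the abstract does not say whether the overlay was done on polygons or on grid cells.

```python
import numpy as np

def sensitivity_specificity(true_risk, flagged):
    """Cell-wise sensitivity and specificity of flagged areas.

    true_risk: boolean array, True where a cell lies in a 'true risk area'
    flagged:   boolean array, True where a cell lies inside a significant p-contour
    """
    true_risk = np.asarray(true_risk, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    tp = np.sum(true_risk & flagged)    # risk cells correctly flagged
    fn = np.sum(true_risk & ~flagged)   # risk cells missed
    tn = np.sum(~true_risk & ~flagged)  # non-risk cells correctly left unflagged
    fp = np.sum(~true_risk & flagged)   # non-risk cells wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)
```

Under this sketch, an estimator that flags little area will tend toward low sensitivity and high specificity, which is the conservative pattern the study evaluates.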

Results: The fixed bandwidth-based sRRF behaved conservatively in identifying these urban 'risk areas', that is, it showed reduced sensitivity but increased specificity due to oversmoothing as compared to the adaptive risk estimator. In contrast, the latter appeared more competitive through variance stabilization, resulting in a higher sensitivity, while its specificity was equal to that of the fixed risk estimator. Halving the originally determined bandwidths led to a simultaneous improvement of sensitivity and specificity of the adaptive sRRF, while the specificity was reduced for the fixed estimator.
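To make the fixed versus adaptive distinction concrete, here is a minimal sketch of one common adaptive scheme, Abramson's square-root rule, in which each observation receives a bandwidth inversely proportional to the square root of a pilot density at that point. The abstract does not state which adaptive rule the study used, so the rule itself, the Gaussian pilot estimate, and the parameter names `h0` and `pilot_h` are all assumptions.

```python
import numpy as np

def abramson_bandwidths(points, h0, pilot_h):
    """Per-point bandwidths h_i = h0 * (pilot_i / gamma) ** (-1/2).

    points:  (n, 2) array of event coordinates
    h0:      global bandwidth (geometric mean of the returned bandwidths)
    pilot_h: fixed bandwidth for the pilot density estimate
    """
    # Fixed-bandwidth Gaussian pilot density evaluated at the data points
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    pilot = np.exp(-d2 / (2 * pilot_h**2)).sum(axis=1) / (
        len(points) * 2 * np.pi * pilot_h**2
    )
    # Geometric mean of the pilot values, used as the scaling constant
    gamma = np.exp(np.mean(np.log(pilot)))
    # Dense regions get small bandwidths, sparse regions get large ones
    return h0 * np.sqrt(gamma / pilot)
```

With this rule, points in densely populated (for example urban) regions are smoothed less than points in sparse regions, which is the mechanism by which an adaptive estimator can reduce oversmoothing where data are concentrated.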

Conclusion: The fixed risk estimator tends to oversmooth in urban areas while overestimating the risk in rural areas. The use of an adaptive bandwidth regime attenuated this pattern but led in general to a higher false positive rate because, in our study design, the majority of true risk areas were located in urban areas. However, there is a strong need for further optimizing the bandwidth selection methods, especially for the adaptive sRRF.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Female
  • Germany / epidemiology
  • Humans
  • Male
  • Middle Aged
  • Neoplasms / diagnosis
  • Neoplasms / epidemiology*
  • Registries / statistics & numerical data
  • Risk Factors
  • Spatial Analysis*