Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics

Int J Health Geogr. 2016 Aug 3;15(1):27. doi: 10.1186/s12942-016-0056-6.

Abstract

Background: Spatial and space-time scan statistics are widely used in disease surveillance to identify geographical areas of elevated disease risk and for the early detection of disease outbreaks. With a scan statistic, a scanning window of variable location and size moves across the map to evaluate thousands of overlapping windows as potential clusters, while adjusting for multiple testing. The method almost always finds many very similar overlapping clusters, and it is not useful to report all of them. This paper proposes using the Gini coefficient to help select which of the many overlapping clusters to report.

Methods: The Gini coefficient provides a quick and intuitive way to evaluate the degree of heterogeneity of a collection of clusters, which helps explain how well the cluster collection reveals the underlying true cluster patterns. Using simulation studies and real cancer mortality data, the Gini-based approach is compared with the traditional approach for reporting non-overlapping clusters.
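
As an illustration of the idea, the following minimal sketch computes a Gini coefficient from a Lorenz curve of cumulative population share against cumulative case share. It assumes the standard Lorenz-curve definition; the abstract does not spell out the exact construction used in the paper or in SaTScan, and the function name `gini_from_regions` and the example numbers are hypothetical.

```python
def gini_from_regions(cases, populations):
    """Gini coefficient from a Lorenz curve of regions.

    Assumption (not taken from the paper): each region, for example a
    reported cluster or the remainder of the map, has a case count and a
    strictly positive population. Regions are ordered by increasing rate.
    """
    # Sort regions by rate (cases per person), lowest first.
    regions = sorted(zip(cases, populations), key=lambda cp: cp[0] / cp[1])
    total_cases = sum(cases)
    total_pop = sum(populations)

    area = 0.0               # area under the Lorenz curve
    x_prev = y_prev = 0.0    # previous point on the curve
    cum_cases = cum_pop = 0
    for c, p in regions:
        cum_cases += c
        cum_pop += p
        x = cum_pop / total_pop        # cumulative population share
        y = cum_cases / total_cases    # cumulative case share
        area += (x - x_prev) * (y + y_prev) / 2.0  # trapezoid rule
        x_prev, y_prev = x, y

    # 0 means the rate is the same everywhere; larger values mean
    # cases are concentrated in a smaller share of the population.
    return 1.0 - 2.0 * area
```

With equal rates in every region the Lorenz curve is the diagonal and the result is 0. Two equally sized regions holding 10 and 90 cases give a value of about 0.4.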

Results: The Gini coefficient can identify a more refined collection of non-overlapping clusters to report. For example, it is able to determine when it makes more sense to report a collection of smaller non-overlapping clusters versus a single large cluster containing all of them. It also fulfils a set of desirable theoretical properties, such as being invariant when all population numbers are multiplied by the same constant.
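
The invariance property can be seen from a short derivation. This is a sketch under the standard Lorenz-curve definition of the Gini coefficient, which is an assumption here since the abstract gives no formula; the symbols c_i, p_i, C, P and a are introduced only for illustration.

```latex
% Assumed setup: regions i = 1..n with cases c_i and population p_i > 0,
% sorted by increasing rate c_i / p_i; C = \sum_i c_i, P = \sum_i p_i.
x_k = \frac{1}{P}\sum_{i \le k} p_i, \qquad
y_k = \frac{1}{C}\sum_{i \le k} c_i, \qquad x_0 = y_0 = 0,

G = 1 - \sum_{k=1}^{n} (x_k - x_{k-1})\,(y_k + y_{k-1}).

% Replace every p_i by a\,p_i with a constant a > 0:
x_k' = \frac{\sum_{i \le k} a\,p_i}{\sum_{i=1}^{n} a\,p_i} = x_k, \qquad
\frac{c_i}{a\,p_i} = \frac{1}{a}\cdot\frac{c_i}{p_i}
\;\Rightarrow\; \text{the ordering of regions is unchanged}, \qquad
G' = G.
```

Because the population shares x_k and the ordering by rate are both unaffected by the constant a, every point on the Lorenz curve stays where it was, and so does the coefficient.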

Conclusions: The Gini coefficient can be used to determine which set of non-overlapping clusters to report. It has been implemented in the free SaTScan™ software version 9.3 (www.satscan.org).

Keywords: Cancer mortality; Cluster detection; Cluster reporting size; Disease surveillance; Gini coefficient; Log likelihood ratio; SaTScan; Scan statistic; Spatial statistics.

MeSH terms

  • Humans
  • Models, Statistical*
  • Public Health Surveillance / methods*
  • Research Design
  • Spatial Analysis*