[Who Hits the Mark? A Comparative Study of the Free Geocoding Services of Google and OpenStreetMap]

Gesundheitswesen. 2015 Sep;77(8-9):e160-5. doi: 10.1055/s-0035-1549939. Epub 2015 Jul 8.
[Article in German]

Abstract

Background: Geocoding, the process of converting textual information (addresses) into geographic coordinates, is increasingly used in public health and epidemiological research and practice. To date, little attention has been paid to geocoding quality and its impact on different types of spatially related health studies. The primary aim of this study was to compare two freely available geocoding services (Google and OpenStreetMap) with regard to matching rate (percentage of address records that could be geocoded) and positional accuracy (distance between geocodes and the ground truth locations).
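The two evaluation measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the abstract does not say how distances were computed, so the haversine (great-circle) formula on a spherical Earth is an assumption, and the function names `haversine_m` and `matching_rate` are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees

# Assumption: spherical Earth with the conventional mean radius.
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a: Coord, b: Coord) -> float:
    """Great-circle distance in metres between two coordinates."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, h)))


def matching_rate(results: Sequence[Optional[Coord]]) -> float:
    """Percentage of address records for which a geocode was returned.

    An unmatched address is represented as None.
    """
    if not results:
        raise ValueError("no geocoding results supplied")
    matched = sum(1 for r in results if r is not None)
    return 100.0 * matched / len(results)
```

Positional accuracy for a single address is then `haversine_m(geocode, reference)`, evaluated only for addresses that were matched.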

Methods: Residential addresses geocoded by the state office for information and technology of North Rhine-Westphalia (NRW) served as reference data (gold standard). The gold standard included the coordinates, the quality of the addresses (4 categories), and a binary urbanity indicator based on the CORINE land cover data. After stratification by address quality and urbanity indicator, 2 500 addresses were randomly sampled per stratum (approximately 20 000 addresses in total). These address samples were geocoded using the geocoding services from Google and OpenStreetMap (OSM).
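A stratified random sample of this kind could be drawn roughly as follows. The sketch assumes that the 2 500 addresses were drawn within each of the 4 × 2 strata (which would explain the total of approximately 20 000); the abstract does not spell this out. The function name `stratified_sample`, the record layout, and the seed are hypothetical.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def stratified_sample(
    records: Iterable[T],
    key: Callable[[T], Hashable],
    n_per_stratum: int,
    seed: int = 0,
) -> Dict[Hashable, List[T]]:
    """Draw up to n_per_stratum records at random from every stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group the records by stratum, e.g. (address quality, urbanity).
    strata: Dict[Hashable, List[T]] = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    # Sample without replacement; a small stratum yields all its members.
    return {
        stratum: rng.sample(members, min(n_per_stratum, len(members)))
        for stratum, members in strata.items()
    }
```

With records shaped like `{"quality": 1, "urban": True, ...}`, the call `stratified_sample(records, key=lambda r: (r["quality"], r["urban"]), n_per_stratum=2500)` would return one list per stratum. A stratum with fewer than 2 500 members contributes fewer addresses, which is one possible reason for a total that is only approximately 20 000.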

Results: In general, the matching rate of both geocoding services decreased with decreasing address quality and urbanity. Google consistently showed higher completeness than OSM (>93% vs. >82%). Also, the cartographic confounding between urban and rural regions was less pronounced with Google's geocoding API. Regarding the positional accuracy of the geo-coordinates, Google also showed smaller deviations from the reference coordinates, with a median of <9 m vs. <175.8 m for OSM. According to the cumulative distribution function of the positional errors, nearly 95% of the addresses geocoded by Google and 50% of those geocoded by OSM lay within 50 m of their reference coordinates.
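The summary statistics reported here (median deviation and the share of addresses within a distance threshold) can be computed from a list of per-address positional errors. The following is a minimal sketch; the function name `share_within` is hypothetical, and the distances in the usage note are made-up illustration values, not data from the study.

```python
import statistics
from typing import Sequence


def share_within(distances_m: Sequence[float], threshold_m: float) -> float:
    """Empirical cumulative distribution evaluated at threshold_m.

    Returns the fraction of positional errors that are at most
    threshold_m metres.
    """
    if not distances_m:
        raise ValueError("no distances supplied")
    within = sum(1 for d in distances_m if d <= threshold_m)
    return within / len(distances_m)


def median_error_m(distances_m: Sequence[float]) -> float:
    """Median positional error in metres."""
    return statistics.median(distances_m)
```

For the invented errors `[5.0, 8.0, 12.0, 40.0, 300.0]`, `median_error_m` gives 12.0 and `share_within(..., 50.0)` gives 0.8. The median is a natural choice here because a few grossly misplaced geocodes would dominate a mean.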

Conclusion: The geocoding API from Google is superior to OSM regarding completeness and positional accuracy of the geocoded addresses. On the other hand, Google imposes several restrictions, such as a limit of 2 500 address requests per 24 h and the requirement that results be displayed exclusively on Google Maps, which may complicate its use for scientific purposes.

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Data Accuracy*
  • Geographic Information Systems / statistics & numerical data*
  • Geographic Mapping*
  • Germany
  • Meaningful Use / statistics & numerical data*
  • Natural Language Processing
  • Reproducibility of Results
  • Search Engine / statistics & numerical data*
  • Sensitivity and Specificity