Digitising Scotland — automated geocoding framework
Digitising Scotland
Academia | University of St Andrews / National Records of Scotland
2014–2016
Python, GIS, PostGIS, Historical Data, Geocoding, ESRC
Project Description
Digitising Scotland was an ESRC-funded project at the University of St Andrews in partnership with the National Records of Scotland (NRS). The goal was the large-scale digitisation and spatial referencing of approximately 24 million Scottish vital event records (births, deaths, and marriages) covering the period 1855–1974.
My primary contribution was the design and development of HAG-GIS (Historical Address Geocoder - GIS), a fully automated Python-based spatial processing framework that geocodes historical addresses from nineteenth- and twentieth-century Scotland to modern-day georeferenced locations.
Key Contributions
- Designed the end-to-end automated geocoding pipeline
- Developed address parsing and normalisation routines for historical address text that predates standardised formats
- Built a spatial matching engine using PostGIS and Python to link records to census enumeration districts and modern administrative boundaries
- Handled edge cases including address ambiguity, name variation over time, and boundary changes across 120 years
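The parsing, normalisation, and matching steps listed above can be illustrated with a minimal sketch. This is not the HAG-GIS code: the abbreviation table, the function names `normalise_address` and `best_match`, the 0.85 threshold, and the sample gazetteer are all hypothetical, and the standard-library `difflib` stands in for whatever matching logic the real framework used.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real one would be far larger and
# would need context to tell "St" (street) from "St" (saint).
ABBREVIATIONS = {
    "st": "street",
    "rd": "road",
    "pl": "place",
    "ter": "terrace",
    "sq": "square",
}


def normalise_address(raw):
    """Lower-case, strip punctuation, and expand common abbreviations."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def best_match(raw, gazetteer, threshold=0.85):
    """Return (name, score) for the closest gazetteer entry.

    The name is None when no entry reaches the (assumed) threshold,
    so ambiguous records can be set aside instead of being mis-linked.
    """
    target = normalise_address(raw)
    best_name, best_score = None, 0.0
    for name in gazetteer:
        score = SequenceMatcher(None, target, normalise_address(name)).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Illustrative usage with made-up inputs
gazetteer = ["Canongate", "Cowgate", "Grassmarket"]
print(normalise_address("12 High St., Edinburgh"))
print(best_match("Cannongate", gazetteer))
```

In a pipeline of this kind, the matched name would then be joined to a geometry, for example a census enumeration district polygon held in PostGIS. Returning `None` below the threshold is a deliberate choice in this sketch so that uncertain records are flagged instead of forced onto the nearest candidate.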
Skills Used
- Python 2.7 / 3.x
- PostGIS / PostgreSQL
- ESRI ArcGIS / QGIS
- Spatial analysis and address matching
- Large-scale data processing