Automatic Georeferencing of Topographic Map Sheets Using OpenCV and Tesseract
Keywords: automatic georeferencing, OpenCV, Python, Tesseract, GDAL
Abstract. The authors developed a pipeline for the automatic georeferencing of older 1 : 25 000 topographic map sheets of Hungary. The pipeline first detects the corners of the map content, then recognizes the sheet identifier. These maps depict geographic quadrangles whose extent can be derived from the sheet ID. The detected sheet corners then serve as ground control points (GCPs) for georeferencing.
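As an illustration of the last step, the sketch below pairs four detected pixel corners with the geographic corners of a quadrangle to form GCPs. It is a minimal sketch, not the authors' implementation: the `GCP` tuple, the `corner_gcps` function, and the extent values in the usage example are hypothetical. It also assumes the scan is only slightly rotated, so the corners can be ordered by their image coordinates.

```python
from typing import NamedTuple


class GCP(NamedTuple):
    """Hypothetical container: pixel position plus geographic position."""
    pixel_x: float
    pixel_y: float
    lon: float
    lat: float


def corner_gcps(pixel_corners, west, south, east, north):
    """Pair four detected pixel corners with the quadrangle's corners.

    pixel_corners: four (x, y) tuples in any order, in image coordinates
    (y grows downwards). Assumes the sheet is rotated only slightly.
    """
    # The two corners with the smallest y form the top edge.
    by_row = sorted(pixel_corners, key=lambda p: p[1])
    top_left, top_right = sorted(by_row[:2])
    bottom_left, bottom_right = sorted(by_row[2:])
    return [
        GCP(*top_left, west, north),
        GCP(*top_right, east, north),
        GCP(*bottom_right, east, south),
        GCP(*bottom_left, west, south),
    ]


# Usage with made-up pixel corners and a made-up extent.
gcps = corner_gcps(
    [(980, 40), (30, 50), (35, 760), (985, 750)],
    west=19.0, south=47.0, east=19.125, north=47.0833,
)
```

In a real pipeline the resulting pairs would be handed to a georeferencing library such as GDAL; that step is omitted here.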
The whole process is implemented in Python, using various open source libraries: OpenCV for image processing, Tesseract for OCR and GDAL for georeferencing.
A total of 1147 map sheets were processed at an average speed of 4 seconds per sheet. False corner detections are filtered automatically by geometric analysis of the detected GCPs, while the sheet IDs are validated using regular expressions. The corner detection error is under 1% of the sheet size for 89% of the sheets and under 2% for 99%. The sheet ID recognition success rate is 75.9%.
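The two validation steps can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the sheet ID pattern `SHEET_ID_RE` is hypothetical and would need to be replaced by the actual naming scheme of the series, and the tolerances in `corners_plausible` are arbitrary example values. The geometric check simply tests whether the four corners form an approximately rectangular quadrilateral.

```python
import math
import re

# Hypothetical ID pattern, for illustration only.
SHEET_ID_RE = re.compile(r"^[A-Z]-\d{2}-\d{1,3}-[A-D]-[a-d]$")


def is_valid_sheet_id(text):
    """Return True if the OCR output matches the assumed ID pattern."""
    return SHEET_ID_RE.match(text.strip()) is not None


def corners_plausible(corners, max_angle_dev_deg=3.0, max_side_dev=0.03):
    """Check that four corners form a roughly rectangular quadrilateral.

    corners: (x, y) tuples ordered top-left, top-right, bottom-right,
    bottom-left. Both tolerances are example values, not tuned ones.
    """
    n = len(corners)
    if n != 4:
        return False
    # Every interior angle must be close to 90 degrees.
    for i in range(n):
        prev, cur, nxt = corners[i - 1], corners[i], corners[(i + 1) % n]
        v1 = (prev[0] - cur[0], prev[1] - cur[1])
        v2 = (nxt[0] - cur[0], nxt[1] - cur[1])
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            return False  # two corners coincide
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
        if abs(angle - 90.0) > max_angle_dev_deg:
            return False
    # Opposite sides must have nearly equal length.
    sides = [math.dist(corners[i], corners[(i + 1) % n]) for i in range(n)]
    for a, b in ((sides[0], sides[2]), (sides[1], sides[3])):
        if abs(a - b) / max(a, b) > max_side_dev:
            return False
    return True
```

A sheet whose corners fail `corners_plausible`, or whose recognized ID fails `is_valid_sheet_id`, would be flagged for manual review instead of being georeferenced automatically.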
Although the system is fine-tuned to a specific map series, it can be easily adapted to any other map series with an approximately rectangular frame.