Mining multiple sources of historical data: The example of a standardized dataset of medieval monasteries and convents in France
Keywords: Christian monasteries and convents, religious orders, spatiotemporal dataset, data mining, data integration
Abstract. In this paper, we present a dataset of medieval monasteries and convents within the territory of present-day France and discuss the workflow used to integrate it. Spatial historical data are usually dispersed and stored in various forms: encyclopedias and catalogues, websites, online databases, and printed maps. To cope with this heterogeneity and enable computational analysis, we have devised a method that comprises the creation of a data model, data mining from sources, data transformation, geocoding, editing, and conflict resolution.
The resulting dataset is probably the most comprehensive collection of records on medieval monasteries within the borders of present-day France. It can be used to study the spatial patterns of medieval Christian monasticism and the implantation of the official Church infrastructure, as well as the relation between this infrastructure and phenomena covered in other datasets. We make the dataset and the mining scripts publicly available (https://github.com/adammertel/dissinet.monasteries) and provide a map tool to visualize, filter, and download the records (http://hde.geogr.muni.cz/monasteries).
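To make the data-model and conflict-resolution steps of the workflow more concrete, the following is a minimal sketch in Python. It is not the code from the published scripts: the `Record` fields, the `normalize` and `resolve` helpers, the source labels, and the source-priority rule are all assumptions chosen for illustration, and the sample values are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical record layout; the field names are illustrative assumptions,
# not the schema of the published dataset.
@dataclass
class Record:
    name: str
    order: Optional[str]
    lat: Optional[float]
    lon: Optional[float]
    source: str


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so simple name variants compare equal."""
    return " ".join(name.lower().split())


def first_known(group: list, attr: str):
    """Return the first non-missing value of `attr` in an already ranked group."""
    return next((getattr(r, attr) for r in group if getattr(r, attr) is not None), None)


def resolve(records: list, priority: list) -> list:
    """Merge records that share a normalized name.

    For each field, keep the first non-missing value, visiting sources in the
    order given by `priority` (an assumed trust ranking, most trusted first).
    """
    rank = {src: i for i, src in enumerate(priority)}
    groups: dict = {}
    for rec in records:
        groups.setdefault(normalize(rec.name), []).append(rec)

    merged = []
    for group in groups.values():
        # Unknown sources are ranked after all listed ones.
        group.sort(key=lambda r: rank.get(r.source, len(rank)))
        merged.append(Record(
            name=group[0].name,
            order=first_known(group, "order"),
            lat=first_known(group, "lat"),
            lon=first_known(group, "lon"),
            source="+".join(r.source for r in group),
        ))
    return merged


# Two illustrative records describing the same house in different sources.
records = [
    Record("Abbaye de Cluny", None, 46.434, 4.659, "catalogue"),
    Record("abbaye  de cluny", "Benedictines", None, None, "website"),
]
merged = resolve(records, priority=["catalogue", "website"])
print(merged)
```

Matching on a normalized name alone is the simplest possible rule. A real integration would likely also compare coordinates and dates before treating two records as the same monastery.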