Crowd-sourced data collection to support automatic classification of building footprint data
Keywords: Crowdsourcing, Building classification, Land Use Mapping, Sustainable Development
Abstract. Human settlements are mainly formed by buildings with their different characteristics and usage. Despite the importance of buildings for the economy and society, complete regional or even national figures of the entire building stock and its spatial distribution are still hardly available. Available digital topographic data sets created by National Mapping Agencies or mapped voluntarily through a crowd via Volunteered Geographic Information (VGI) platforms (e.g. OpenStreetMap) contain building footprint information but often lack additional information on building type, usage, age or number of floors. For this reason, predictive modeling is becoming increasingly important in this context. The capabilities of machine learning allow for the prediction of building types and other building characteristics and thus, the efficient classification and description of the entire building stock of cities and regions. However, such data-driven approaches always require a sufficient amount of ground truth (reference) information for training and validation. The collection of reference data is usually cost-intensive and time-consuming. Experiences from other disciplines have shown that crowdsourcing offers the possibility to support the process of obtaining ground truth data. Therefore, this paper presents the results of an experimental study aiming at assessing the accuracy of non-expert annotations on street view images collected from an internet crowd. The findings provide the basis for a future integration of a crowdsourcing component into the process of land use mapping, particularly the automatic building classification.