Data Cleansing Method for Sparse Trajectory Data: A Case Study of Shared Electric Bicycles in Tengzhou
Keywords: Shared e-bike, GPS trajectory data, recognition of O-D pairs, sparse data, Tengzhou
Abstract. Location-based service (LBS) technologies provide a new perspective for analyzing the spatiotemporal dynamics of urban systems. Previous studies have used data from mobile communications, public transport vehicles (taxis and buses), wireless hotspots, and shared bicycles. However, analysis based on shared electric bicycles (e-bikes) has yet to appear in the literature. Data cleansing and the extraction of origin-destination (O-D) pairs are prerequisites for studying the spatiotemporal patterns of urban systems. In this study, using a dataset containing one week of shared e-bike GPS data from Tengzhou City (Shandong Province), we identify the sparse characteristics of the data: the GPS trajectories are discontinuous and non-uniformly sampled, and they lack riding-status information. Based on these characteristics, and taking the actual road network into account, we propose a method for extracting the O-D pair of each trajectory segment from continuous, stateless GPS trajectory data. The method removes incomplete and invalid trajectory records and is suited to sparse trajectory data. Finally, we cleanse the week-long shared e-bike GPS dataset from Tengzhou City and verify by sampling that the extraction accuracy is 91%. In summary, we provide, for the first time, preliminary cleansing rules for the sparse trajectory data of shared e-bikes; the rules are reliable on the sampled data and are also suitable for mining other forms of sparse GPS trajectory data.
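As a minimal illustration of the kind of O-D extraction summarized above, the following Python sketch splits one bike's time-ordered GPS points into segments wherever the sampling gap is large, discards segments that are too short, and takes the first and last point of each remaining segment as its O-D pair. The gap threshold, minimum point count, minimum displacement, and the function names are assumptions made for this example; they are not the cleansing rules of the study, and the road-network step is omitted.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 points."""
    r = 6371000.0  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def extract_od_pairs(points, max_gap_s=600, min_points=3, min_dist_m=100.0):
    """Split time-ordered (timestamp_s, lat, lon) points into segments and return O-D pairs.

    All thresholds are hypothetical defaults chosen for illustration only.
    """
    # Step 1: start a new segment whenever the gap to the previous point exceeds max_gap_s.
    segments, current = [], []
    for p in points:
        if current and p[0] - current[-1][0] > max_gap_s:
            segments.append(current)
            current = []
        current.append(p)
    if current:
        segments.append(current)

    # Step 2: drop incomplete segments (too few points) and invalid ones
    # (origin and destination closer together than min_dist_m).
    od_pairs = []
    for seg in segments:
        if len(seg) < min_points:
            continue
        origin, dest = seg[0], seg[-1]
        if haversine_m(origin[1], origin[2], dest[1], dest[2]) < min_dist_m:
            continue
        od_pairs.append((origin, dest))
    return od_pairs
```

Calling `extract_od_pairs(points)` on a list of `(timestamp_s, lat, lon)` tuples for a single bike returns a list of `(origin, destination)` tuples. In practice the thresholds would need to be calibrated against the sampling behaviour of the actual dataset.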