Mining jams into pollution: how Waze data helps estimating air pollution in large cities
Air pollution is a growing concern for international medical organizations and governments due to its relation to a large number of respiratory diseases, among other effects. This work proposes an open, crowdsourced, scalable methodology to model the spatial distribution of air pollution in cities worldwide. We use both Waze and OpenStreetMap data to construct a collection of features intended to model car emissions in (large) cities. Waze data reports every jammed road segment in a region at two-minute intervals, and OpenStreetMap (OSM) is an open-source, detailed, dynamically updated spatial database of mapped features. Our model is trained using data from a 30 sq km region of Oakland, California, in the United States of America. The dependent variables are the annual concentrations of fine-grained black carbon, nitric oxide (NO), and nitrogen dioxide (NO2). The features are aggregated in hexagons with a 173-meter edge. We observe that pollutant concentration across hexagons follows a power law and that high concentrations are associated with the presence of highways. We estimate four models: simple linear regression in which the only feature is the presence of a highway in the hexagon, multiple linear regression, random forest, and XGBoost. XGBoost yields the best results on the validation set for black carbon, NO, and NO2. Finally, we extrapolate the model to Montevideo, Uruguay, and observe that the predictions agree with what is expected in practice.
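As an illustration of the hexagonal aggregation step, the following is a minimal sketch and not the authors' implementation. It assumes jam locations have already been projected to planar coordinates in meters and bins them into a flat grid of pointy-top hexagons with a 173-meter edge; the abstract does not say which grid system or projection is actually used. The function names (`hexagon_area_m2`, `point_to_hex`, `count_per_hex`) are hypothetical. Under these assumptions each hexagon covers roughly 0.078 sq km, so a 30 sq km region would hold on the order of a few hundred cells.

```python
import math
from collections import Counter

HEX_EDGE_M = 173.0  # hexagon edge length stated in the abstract


def hexagon_area_m2(edge_m):
    """Area of a regular hexagon: (3 * sqrt(3) / 2) * edge^2."""
    return 3.0 * math.sqrt(3.0) / 2.0 * edge_m ** 2


def _cube_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon."""
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the component with the largest rounding error so that
    # the cube-coordinate constraint x + y + z == 0 still holds.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return int(rx), int(rz)


def point_to_hex(x_m, y_m, edge_m=HEX_EDGE_M):
    """Map a planar point (meters) to axial (q, r) indices of a
    pointy-top hexagon grid centered on the origin (an assumed layout)."""
    q = (math.sqrt(3.0) / 3.0 * x_m - y_m / 3.0) / edge_m
    r = (2.0 / 3.0 * y_m) / edge_m
    return _cube_round(q, r)


def count_per_hex(points, edge_m=HEX_EDGE_M):
    """Count points (for example, jam observations) falling in each hexagon."""
    return Counter(point_to_hex(x, y, edge_m) for x, y in points)
```

A per-hexagon count such as the one returned by `count_per_hex` is one example of a feature that could feed the regression models listed above; the actual feature set is defined in the full work.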