Human vs. Machine: Comparing Urban Condition Classification Methods from Vehicular Vision

Nov 1, 2025 · Julia Gersey, Troy Zhong, Jiale Zhang, Jesse Codling, Jackelyn Hwang, Pei Zhang
Abstract
Urban street conditions such as cleanliness and infrastructure decay affect public health and community well-being, yet monitoring often relies on subjective human judgment. Existing machine learning approaches narrowly target single factors like trash detection or road damage and require large datasets or costly pipelines. To balance task generalization with limited data, we propose a lightweight, interpretable pipeline for holistic condition assessment from street-level imagery. Using 325 images from seven U.S. cities with ten annotators per image, we find conventional models plateau at ~57% accuracy, but subjectivity-aware cleaning raises performance to ~62%, surpassing the weakest human annotator. Our contribution is empirical and methodological: we show that explicitly modeling annotator subjectivity can yield more reliable gains in handling noisy, subjective datasets than increasing algorithmic complexity, positioning lightweight pipelines as practical first-pass classifiers for scalable urban monitoring.
Type
Publication
Proceedings of the 12th ACM International Conference on Systems for Energy-efficient Buildings, Cities, and Transportation
