Poster Abstract: Leveraging General-Purpose Audio Datasets for Vibration-Based Crowd Monitoring in Stadiums

May 1, 2025

Yen-Cheng Chang

Jesse Codling

Yiwen Dong

Jiale Zhang

Jiasi Chen

Hae Young Noh

Pei Zhang



Abstract

Crowd monitoring in sports stadiums is important for enhancing public safety and improving the audience experience. Existing approaches rely mainly on cameras and microphones, which can be disruptive and often raise privacy concerns. In this paper, we instead sense floor vibration, a less disruptive and less intrusive modality, to predict crowd behavior. However, because vibration-based crowd monitoring is newly developed, one main challenge is the lack of training data: sports stadiums are usually large public spaces with complex physical activities. To overcome this challenge, we present Vibration Leverage Audio (ViLA), a vibration-based method that reduces the dependency on labeled data by pre-training with unlabeled cross-modality data. ViLA first pre-trains a model in an unsupervised manner on commonly available audio datasets and then fine-tunes the model with a small amount of labeled vibration data. Our real-world experiments demonstrate that pre-training the vibration model on publicly available audio data (YouTube8M) improves accuracy by up to 4.5× compared with the same model without audio pre-training.
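The two-stage recipe described in the abstract (unsupervised pre-training on plentiful unlabeled data, then fine-tuning on a few labeled samples) can be illustrated with a toy sketch. This is not the authors' implementation: the tied-weight linear autoencoder, the logistic-regression head, the synthetic vectors standing in for audio and vibration features, the binary label, and every function name and hyperparameter below are assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def pretrain_encoder(unlabeled, hidden, epochs=50, lr=0.01, seed=0):
    """Unsupervised stage (assumed objective): tied-weight linear autoencoder.

    Minimizes ||W^T W x - x||^2 with SGD; no labels are used.
    """
    rng = random.Random(seed)
    dim = len(unlabeled[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(hidden)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in unlabeled:
            h = matvec(W, x)  # encode
            # Decode with the transposed encoder weights.
            x_hat = [sum(W[k][j] * h[k] for k in range(hidden)) for j in range(dim)]
            err = [a - b for a, b in zip(x_hat, x)]
            total += sum(e * e for e in err)
            We = matvec(W, err)
            for k in range(hidden):
                for j in range(dim):
                    # Gradient of the reconstruction loss w.r.t. W[k][j].
                    W[k][j] -= lr * 2.0 * (h[k] * err[j] + We[k] * x[j])
        losses.append(total / len(unlabeled))  # mean loss seen during this epoch
    return W, losses


def fine_tune(W, labeled, epochs=200, lr=0.1, lr_enc=0.01):
    """Supervised stage: train a logistic head and gently update the encoder."""
    W = [row[:] for row in W]  # keep the pre-trained weights untouched
    v, b = [0.0] * len(W), 0.0
    for _ in range(epochs):
        for x, y in labeled:
            h = matvec(W, x)
            logit = sum(vk * hk for vk, hk in zip(v, h)) + b
            g = 1.0 / (1.0 + math.exp(-logit)) - y  # d(cross-entropy)/d(logit)
            for k in range(len(W)):
                for j in range(len(x)):
                    W[k][j] -= lr_enc * g * v[k] * x[j]
                v[k] -= lr * g * h[k]
            b -= lr * g
    return W, v, b


def predict(W, v, b, x):
    """Return the predicted binary label for feature vector x."""
    h = matvec(W, x)
    return 1 if sum(vk * hk for vk, hk in zip(v, h)) + b > 0 else 0


# Synthetic stand-ins: many unlabeled "audio" vectors, few labeled "vibration" ones.
rng = random.Random(1)
u = [1.0, 0.5, -0.5, 0.25]  # hypothetical structure shared by both modalities
audio = []
for _ in range(40):
    a = rng.uniform(-1, 1)
    audio.append([a * ui + rng.gauss(0, 0.05) for ui in u])
vibration = [([a * ui for ui in u], int(a > 0))
             for a in (-1.0, -0.7, -0.4, 0.4, 0.7, 1.0)]

W_pre, losses = pretrain_encoder(audio, hidden=2)
W_ft, v, b = fine_tune(W_pre, vibration)
accuracy = sum(predict(W_ft, v, b, x) == y for x, y in vibration) / len(vibration)
print(f"reconstruction loss: {losses[0]:.3f} -> {losses[-1]:.3f}, "
      f"train accuracy: {accuracy:.2f}")
```

The point of the sketch is the data flow: the encoder weights are learned without any labels and are then reused as the starting point for the supervised stage, so the labeled set can stay small. A real system would presumably use a neural encoder and spectrogram-like inputs rather than these toy vectors.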

Type

Conference paper

Publication

Proceedings of the 23rd ACM Conference on Embedded Networked Sensor Systems

Last updated on May 1, 2025


