
Emotional Crowd Sound

Valentina Franzoni (Supervision); Giulio Biondi (Collaboration Group Member); Alfredo Milani (Collaboration Group Member)
2021

Abstract

Crowds express emotions as a collective individual, which is evident from the sounds a crowd produces in particular events, e.g., collective booing, laughing, or cheering at sports matches, movies, theaters, concerts, political demonstrations, and riots. Crowd sounds can be characterized by frequency-amplitude features, using analysis techniques similar to those applied to individual voices, where deep learning classification is applied to spectrogram images derived from sound transformations. We present the first dataset for applying a technique based on the generation of sound spectrograms from fixed-length fragments extracted from original audio clips recorded at high-attendance events, where the crowd acts as a collective individual. Transfer learning techniques can be applied to a neural network, either novel or pre-trained on low-level features using extensive datasets of visual knowledge. The original sound clips are filtered and normalized in amplitude for correct spectrogram generation, on which the domain-specific features are then fine-tuned. The dataset includes the complete data of the study, allowing each step to be reproduced.
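The fragment-and-spectrogram pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fragment length, FFT parameters, and log scaling are assumptions chosen for the example, and the original study may have used different tooling (e.g., mel-scaled spectrograms rendered as images).

```python
import numpy as np
from scipy.signal import spectrogram

def audio_to_spectrograms(samples, sr, fragment_s=2.0):
    """Split a mono signal into fixed-length fragments and compute
    one log-magnitude spectrogram per fragment (illustrative sketch)."""
    # Peak-normalize amplitude so spectrogram dynamics are comparable
    # across clips recorded at different levels.
    peak = np.max(np.abs(samples))
    if peak > 0:
        samples = samples / peak
    frag_len = int(fragment_s * sr)
    n_frags = len(samples) // frag_len  # drop the trailing remainder
    specs = []
    for i in range(n_frags):
        frag = samples[i * frag_len:(i + 1) * frag_len]
        # 512-sample windows with 50% overlap: assumed parameters.
        f, t, Sxx = spectrogram(frag, fs=sr, nperseg=512, noverlap=256)
        # Log scale to compress the dynamic range, as is typical
        # before converting spectrograms to images for a CNN.
        specs.append(np.log1p(Sxx))
    return specs

# Example: 5 s of synthetic noise at 16 kHz stands in for a crowd clip.
sr = 16000
rng = np.random.default_rng(0)
audio = rng.standard_normal(5 * sr)
specs = audio_to_spectrograms(audio, sr, fragment_s=2.0)
print(len(specs), specs[0].shape)  # 2 fragments of 257 freq bins x 124 frames
```

Each resulting array could then be saved as an image and fed to a network pre-trained on visual data for fine-tuning, as the abstract outlines.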
Files in this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1575915