Emotional Crowd Sound
Valentina Franzoni (Supervision)
Giulio Biondi (Collaboration Group member)
Alfredo Milani (Collaboration Group member)
2021
Abstract
Crowds express emotions as a collective individual, which is evident from the sounds a crowd produces at particular events, e.g., collective booing, laughing, or cheering at sports matches, movies, theaters, concerts, political demonstrations, and riots. Crowd sounds can be characterized by frequency-amplitude features, using analysis techniques similar to those applied to individual voices, where deep learning classification is applied to spectrogram images derived from sound transformations. We present the first dataset for applying a technique based on the generation of sound spectrograms from fixed-length fragments extracted from original audio clips recorded at high-attendance events, where the crowd acts as a collective individual. Transfer learning techniques can then be used on a neural network, either novel or pre-trained on low-level features using extensive datasets of visual knowledge. The original sound clips are filtered and normalized in amplitude for correct spectrogram generation, on which the domain-specific features are fine-tuned. This dataset includes the complete data of the study, allowing each step to be reproduced.
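
The fragment-extraction and spectrogram-generation step described in the abstract could be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the fragment length, sampling rate, mel-spectrogram representation, file names, and output layout are all assumptions made for the sake of example.

```python
# Minimal sketch: split a normalized audio clip into fixed-length
# fragments and render one spectrogram image per fragment.
# FRAGMENT_SECONDS and SAMPLE_RATE are hypothetical choices.
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

FRAGMENT_SECONDS = 5          # assumed fixed fragment length
SAMPLE_RATE = 22050           # librosa's default sampling rate


def clip_to_spectrograms(path, out_prefix):
    """Split one clip into fixed-length fragments and save a mel
    spectrogram image for each fragment."""
    # Load the clip and peak-normalize the amplitude so spectrogram
    # scales are comparable across recordings.
    y, sr = librosa.load(path, sr=SAMPLE_RATE)
    y = y / (np.max(np.abs(y)) + 1e-9)

    samples_per_fragment = FRAGMENT_SECONDS * sr
    n_fragments = len(y) // samples_per_fragment

    for i in range(n_fragments):
        fragment = y[i * samples_per_fragment:(i + 1) * samples_per_fragment]
        # Mel spectrogram in dB, rendered as an axis-free image
        # suitable as CNN input.
        S = librosa.feature.melspectrogram(y=fragment, sr=sr)
        S_db = librosa.power_to_db(S, ref=np.max)
        fig, ax = plt.subplots()
        librosa.display.specshow(S_db, sr=sr, ax=ax)
        ax.set_axis_off()
        fig.savefig(f"{out_prefix}_{i:03d}.png",
                    bbox_inches="tight", pad_inches=0)
        plt.close(fig)
```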
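The transfer-learning step could then look like the sketch below, assuming the spectrogram images are organized in per-class folders (e.g., booing/, cheering/). The base network (ResNet50 pre-trained on ImageNet), the number of classes, and all hyperparameters are illustrative assumptions, not details taken from the dataset record.

```python
# Minimal transfer-learning sketch: reuse low-level visual features
# from an ImageNet-pretrained network and fine-tune a new
# classification head on the spectrogram images.
import tensorflow as tf

IMG_SIZE = (224, 224)
NUM_CLASSES = 3  # hypothetical number of crowd-emotion classes

# Assumed directory layout: spectrograms/train/<class_name>/*.png
train_ds = tf.keras.utils.image_dataset_from_directory(
    "spectrograms/train", image_size=IMG_SIZE, batch_size=32)

base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=IMG_SIZE + (3,))
base.trainable = False  # freeze the pre-trained low-level features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```

Freezing the base network and training only the new head is the usual first stage of fine-tuning; the frozen layers can later be unfrozen with a lower learning rate to adapt the domain-specific features, as the abstract suggests.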