
Acoustic Event Detection

Above: an audio signal. Below: overlapping events in that signal. Source: DCASE 2016 publication, "TUT Database for Acoustic Scene Classification and Sound Event Detection".
Spectrogram of different, consecutive sound events


Sounds carry a lot of information about our everyday environment. Whether it is footsteps outside your door, laughter from another room, a fire alarm, or the incessant barking of a dog on the street, every audio event conveys a certain meaning and may have an important impact on its surroundings. Fortunately, we humans can usually recognize most such events and act accordingly. However, the development of signal processing and machine learning methods that detect events embedded in acoustic signals is still in its nascent stages. Moreover, most techniques so far have focused on isolated sound events (monophonic) in noise-free synthetic environments, whereas in reality most sound events occur in noisy conditions and overlap in time (polyphonic). This mismatch makes Acoustic Event Detection (AED) both a challenging and an interesting area of research.
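
To make the monophonic/polyphonic distinction concrete, the following minimal sketch encodes annotated events as a frame-by-class activity matrix and counts how many events are active in each frame. The function name `events_to_frame_matrix`, the class labels, the event times, and the 0.5 s frame hop are all assumptions chosen for illustration; they are not taken from the DCASE annotations.

```python
import numpy as np

def events_to_frame_matrix(events, labels, duration_s, hop_s):
    """Encode (label, onset_s, offset_s) events as a binary frame-by-class matrix.

    Hypothetical helper for illustration only.
    """
    n_frames = int(np.ceil(duration_s / hop_s))
    label_index = {label: i for i, label in enumerate(labels)}
    activity = np.zeros((n_frames, len(labels)), dtype=np.int8)
    for label, onset_s, offset_s in events:
        start = int(np.floor(onset_s / hop_s))
        stop = min(n_frames, int(np.ceil(offset_s / hop_s)))
        # Mark every frame that the event touches as active for its class.
        activity[start:stop, label_index[label]] = 1
    return activity

# Illustrative labels and times (assumed, not from any dataset).
labels = ["footsteps", "dog_barking", "fire_alarm"]
events = [
    ("footsteps", 0.0, 2.0),
    ("dog_barking", 1.0, 3.0),   # overlaps with footsteps from 1.0 s to 2.0 s
    ("fire_alarm", 3.5, 4.0),
]

activity = events_to_frame_matrix(events, labels, duration_s=4.0, hop_s=0.5)

# Polyphony = number of simultaneously active classes per frame.
polyphony = activity.sum(axis=1)
print(polyphony)
```

A recording is monophonic in this sense when the per-frame count never exceeds 1; any frame with a count of 2 or more contains overlapping events.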



AED has numerous applications, such as searching multimedia by its audio content and intelligent monitoring and surveillance systems for traffic, homes, and biological parks. In addition, detecting the sound events in a recording can lead to a better understanding of the environment, which gives rise to another important field of research, Acoustic Scene Classification (ASC).

Current Work

A simple pipeline for training an event detector

Detection and Classification of Acoustic Scenes and Events (DCASE) is an official IEEE Audio and Acoustic Signal Processing (AASP) challenge that provides a database for AED in synthetic and real life environments.

We currently work with this small database (approx. 50 minutes), recorded in real 'home' and 'residential area' environments, and use a Neural Network (NN) as the classifier. Our work aims to improve accuracy on the task of polyphonic sound event detection in real-life environments. This includes the following:

  1. Detect the occurrence of events in long audio recordings
  2. Determine the onset and offset time stamps of each detected event
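
The two steps above can be sketched as a post-processing stage applied to frame-wise class probabilities. This is a minimal illustration under stated assumptions, not the method used in this work: it assumes some classifier has already produced one probability per frame and class, and it uses a fixed threshold of 0.5. The function name `probabilities_to_events`, the labels, and the values are hypothetical.

```python
import numpy as np

def probabilities_to_events(probs, labels, hop_s, threshold=0.5):
    """Turn frame-wise probabilities into (label, onset_s, offset_s) events.

    probs: array of shape (n_frames, n_classes), assumed classifier output.
    Hypothetical helper; thresholding is an assumed decision rule.
    """
    active = probs >= threshold            # step 1: detect occurrence per frame
    events = []
    for k, label in enumerate(labels):
        # Pad with False so every active run has a rising and a falling edge.
        padded = np.concatenate(([False], active[:, k], [False]))
        changes = np.diff(padded.astype(np.int8))
        onsets = np.flatnonzero(changes == 1)     # first active frame of a run
        offsets = np.flatnonzero(changes == -1)   # first inactive frame after it
        for on, off in zip(onsets, offsets):
            # step 2: convert frame indices to time stamps in seconds
            events.append((label, float(on * hop_s), float(off * hop_s)))
    return sorted(events, key=lambda e: e[1])

# Illustrative probabilities for two assumed classes over four frames.
labels = ["speech", "dishes"]
probs = np.array([
    [0.9, 0.1],
    [0.8, 0.7],   # both classes active: a polyphonic frame
    [0.2, 0.9],
    [0.1, 0.6],
])

detected = probabilities_to_events(probs, labels, hop_s=0.5)
for label, onset_s, offset_s in detected:
    print(f"{label}: {onset_s:.1f} s to {offset_s:.1f} s")
```

Because each class is thresholded independently, overlapping events are reported as separate entries with their own onset and offset. In practice the raw decisions are often smoothed (for example with a median filter) before the edges are extracted, which this sketch leaves out.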




Prerna Arora

Communications Engineering (NT)
