Research - Communications Engineering (NT) | Paderborn University

Loose Coupling of Spectral and Spatial Models for Multi-Channel Diarization and Enhancement of Meetings in Dynamic Environments

A.T. Meise, T. Cord-Landwehr, C. Boeddeker, M. Delcroix, T. Nakatani, R. Haeb-Umbach, in: ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2026.

Speech Synthesis along Perceptual Voice Quality Dimensions

F. Rautenberg, M. Kuhlmann, F. Seebauer, J. Wiechmann, P. Wagner, R. Haeb-Umbach, in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2025.

Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering

T. Cord-Landwehr, T. Gburrek, M. Deegen, R. Haeb-Umbach, in: Proceedings of INTERSPEECH, 2025.

Towards Frame-level Quality Predictions of Synthetic Speech

M. Kuhlmann, F. Seebauer, P. Wagner, R. Haeb-Umbach, in: Interspeech 2025, ISCA, 2025.

A Fully Zero-Shot Approach to Obtaining Specialized and Compact Audio Tagging Models

A. Werning, R. Haeb-Umbach, in: S. Möller, T. Gerkmann, D. Kolossa (Eds.), Proceedings of the 16th ITG Conference on Speech Communication, Berlin, 2025, pp. 76–80.

Distilling Efficient Audio Models using Data Pruning with CLAP

A. Werning, R. Haeb-Umbach, in: Deutsche Gesellschaft für Akustik e.V. (DEGA) (Ed.), Proceedings of DAS|DAGA 2025, Copenhagen, 2025.

On the Application of Diffusion Models for Simultaneous Denoising and Dereverberation

A.T. Meise, T. Cord-Landwehr, R. Haeb-Umbach, in: ITG Conference on Speech Communication, 2025.

Synthesizing Speech with Selected Perceptual Voice Qualities – A Case Study with Creaky Voice

F. Rautenberg, F. Seebauer, J. Wiechmann, M. Kuhlmann, P. Wagner, R. Haeb-Umbach, in: Interspeech 2025, ISCA, 2025.

Target-Specific Dataset Pruning for Compression of Audio Tagging Models

A. Werning, R. Haeb-Umbach, in: 32nd European Signal Processing Conference (EUSIPCO 2024), 2024.

Diminishing Domain Mismatch for DNN-Based Acoustic Distance Estimation via Stochastic Room Reverberation Models

T. Gburrek, A.T. Meise, J. Schmalenstroeer, R. Haeb-Umbach, in: 2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC), IEEE, 2024.

UPB-NT submission to DCASE24: Dataset pruning for targeted knowledge distillation

A. Werning, R. Haeb-Umbach, 2024.

Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder

Y. Xie, M. Kuhlmann, F. Rautenberg, Z.-H. Tan, R. Haeb-Umbach, in: 2024 32nd European Signal Processing Conference (EUSIPCO), 2024, pp. 436–440.

On the Integration of Sampling Rate Synchronization and Acoustic Beamforming

T. Gburrek, J. Schmalenstroeer, R. Haeb-Umbach, in: European Signal Processing Conference (EUSIPCO), 2023.

LibriWASN: A Data Set for Meeting Separation, Diarization, and Recognition with Asynchronous Recording Devices

J. Schmalenstroeer, T. Gburrek, R. Haeb-Umbach, in: ITG Conference on Speech Communication, 2023.

On Feature Importance and Interpretability of Speaker Representations

F. Rautenberg, M. Kuhlmann, J. Wiechmann, F. Seebauer, P. Wagner, R. Haeb-Umbach, in: ITG Conference on Speech Communication, 2023.

Explaining voice characteristics to novice voice practitioners - How successful is it?

J. Wiechmann, F. Rautenberg, P. Wagner, R. Haeb-Umbach, in: 20th International Congress of the Phonetic Sciences (ICPhS), 2023.

Re-examining the quality dimensions of synthetic speech

F. Seebauer, M. Kuhlmann, R. Haeb-Umbach, P. Wagner, in: 12th Speech Synthesis Workshop (SSW) 2023, 2023.

Spatial Diarization for Meeting Transcription with Ad-Hoc Acoustic Sensor Networks

T. Gburrek, J. Schmalenstroeer, R. Haeb-Umbach, in: Proc. Asilomar Conference on Signals, Systems, and Computers, 2023.

Speech Disentanglement for Analysis and Modification of Acoustic and Perceptual Speaker Characteristics

F. Rautenberg, M. Kuhlmann, J. Ebbers, J. Wiechmann, F. Seebauer, P. Wagner, R. Haeb-Umbach, in: Fortschritte Der Akustik - DAGA 2023, 2023, pp. 1409–1412.

Investigating Speaker Embedding Disentanglement on Natural Read Speech

M. Kuhlmann, A.T. Meise, F. Seebauer, P. Wagner, R. Haeb-Umbach, in: Speech Communication; 15th ITG Conference, 2023, pp. 121–125.

Discerning Dimensions of Quality for State of the Art Synthetic Speech

F. Seebauer, M. Kuhlmann, R. Haeb-Umbach, P. Wagner, in: Proceedings of the 20th International Congress of Phonetic Sciences, 2023.

Neural Network Based Carrier Frequency Offset Estimation From Speech Transmitted Over High Frequency Channels

J. Heitkaemper, J. Schmalenstroeer, R. Haeb-Umbach, in: Proceedings of the 30th European Signal Processing Conference (EUSIPCO), Belgrade, 2022.

Data-driven Time Synchronization in Wireless Multimedia Networks

H. Afifi, H. Karl, T. Gburrek, J. Schmalenstroeer, in: 2022 International Wireless Communications and Mobile Computing (IWCMC), IEEE, 2022.

On Synchronization of Wireless Acoustic Sensor Networks in the Presence of Time-Varying Sampling Rate Offsets and Speaker Changes

T. Gburrek, J. Schmalenstroeer, R. Haeb-Umbach, in: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2022.

Technically enabled explaining of voice characteristics

J. Wiechmann, T. Glarner, F. Rautenberg, P. Wagner, R. Haeb-Umbach, in: 18. Phonetik Und Phonologie Im Deutschsprachigen Raum (P&P), 2022.

Investigation into Target Speaking Rate Adaptation for Voice Conversion

M. Kuhlmann, F. Seebauer, J. Ebbers, P. Wagner, R. Haeb-Umbach, in: Interspeech 2022, ISCA, 2022.

Informed vs. Blind Beamforming in Ad-Hoc Acoustic Sensor Networks for Meeting Transcription

T. Gburrek, J. Schmalenstroeer, J. Heitkaemper, R. Haeb-Umbach, in: 2022 International Workshop on Acoustic Signal Enhancement (IWAENC), IEEE, 2022.

A Meeting Transcription System for an Ad-Hoc Acoustic Sensor Network

T. Gburrek, C. Boeddeker, T. von Neumann, T. Cord-Landwehr, J. Schmalenstroeer, R. Haeb-Umbach, arXiv, 2022.

A Database for Research on Detection and Enhancement of Speech Transmitted over HF links

J. Heitkaemper, J. Schmalenstroeer, V. Ion, R. Haeb-Umbach, in: Speech Communication; 14th ITG-Symposium, 2021, pp. 1–5.

A Comparison and Combination of Unsupervised Blind Source Separation Techniques

C. Boeddeker, F. Rautenberg, R. Haeb-Umbach, in: ITG Conference on Speech Communication, 2021.

Open Range Pitch Tracking for Carrier Frequency Difference Estimation from HF Transmitted Speech

J. Schmalenstroeer, J. Heitkaemper, J. Ullmann, R. Haeb-Umbach, in: 29th European Signal Processing Conference (EUSIPCO), 2021, pp. 1–5.

Geometry calibration in wireless acoustic sensor networks utilizing DoA and distance information

T. Gburrek, J. Schmalenstroeer, R. Haeb-Umbach, EURASIP Journal on Audio, Speech, and Music Processing (2021).

Iterative Geometry Calibration from Distance Estimates for Wireless Acoustic Sensor Networks

T. Gburrek, J. Schmalenstroeer, R. Haeb-Umbach, in: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.

On Source-Microphone Distance Estimation Using Convolutional Recurrent Neural Networks

T. Gburrek, J. Schmalenstroeer, R. Haeb-Umbach, in: Speech Communication; 14th ITG-Symposium, 2021, pp. 1–5.

Online Estimation of Sampling Rate Offsets in Wireless Acoustic Sensor Networks with Packet Loss

A. Chinaev, G. Enzner, T. Gburrek, J. Schmalenstroeer, in: 29th European Signal Processing Conference (EUSIPCO), 2021, pp. 1–5.

Contrastive Predictive Coding Supported Factorized Variational Autoencoder for Unsupervised Learning of Disentangled Speech Representations

J. Ebbers, M. Kuhlmann, T. Cord-Landwehr, R. Haeb-Umbach, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 3860–3864.

Statistical and Neural Network Based Speech Activity Detection in Non-Stationary Acoustic Environments

J. Heitkaemper, J. Schmalenstroeer, R. Haeb-Umbach, in: INTERSPEECH 2020, Virtual, Shanghai, China, 2020.

Deep Neural Network based Distance Estimation for Geometry Calibration in Acoustic Sensor Network

T. Gburrek, J. Schmalenstroeer, A. Brendel, W. Kellermann, R. Haeb-Umbach, in: European Signal Processing Conference (EUSIPCO), 2020.

Front-End Processing for the CHiME-5 Dinner Party Scenario

C. Boeddeker, J. Heitkaemper, J. Schmalenstroeer, L. Drude, J. Heymann, R. Haeb-Umbach, in: Proc. CHiME 2018 Workshop on Speech Processing in Everyday Environments, Hyderabad, India, 2018.

MARVELO - A Framework for Signal Processing in Wireless Acoustic Sensor Networks

H. Afifi, J. Schmalenstroeer, J. Ullmann, R. Haeb-Umbach, H. Karl, in: Speech Communication; 13th ITG-Symposium, 2018, pp. 1–5.

Efficient Sampling Rate Offset Compensation - An Overlap-Save Based Approach

J. Schmalenstroeer, R. Haeb-Umbach, in: 26th European Signal Processing Conference (EUSIPCO 2018), 2018.

The RWTH/UPB System Combination for the CHiME 2018 Workshop

M. Kitza, W. Michel, C. Boeddeker, J. Heitkaemper, T. Menne, R. Schlüter, H. Ney, J. Schmalenstroeer, L. Drude, J. Heymann, R. Haeb-Umbach, in: Proc. CHiME 2018 Workshop on Speech Processing in Everyday Environments, Hyderabad, India, 2018.

Benchmarking Neural Network Architectures for Acoustic Sensor Networks

J. Ebbers, J. Heitkaemper, J. Schmalenstroeer, R. Haeb-Umbach, in: ITG 2018, Oldenburg, Germany, 2018.

Insights into the Interplay of Sampling Rate Offsets and MVDR Beamforming

J. Schmalenstroeer, R. Haeb-Umbach, in: ITG 2018, Oldenburg, Germany, 2018.

Fast and Accurate Audio Resampling for Acoustic Sensor Networks by Polyphase-Farrow Filters with FFT Realization

J. Schmalenstroeer, A. Chinaev, G. Enzner, in: Speech Communication; 13th ITG-Symposium, 2018, pp. 1–5.

Building or Enclosure Termination Closing and/or Opening Apparatus, and Method for Operating a Building or Enclosure Termination

F. Jacob, J. Schmalenstroeer, International Patent Number: WO2018/077610A, Patent Classification: WO2018/077610A, 2017.

Multi-Stage Coherence Drift Based Sampling Rate Synchronization for Acoustic Beamforming

J. Schmalenstroeer, J. Heymann, L. Drude, C. Boeddeker, R. Haeb-Umbach, in: IEEE 19th International Workshop on Multimedia Signal Processing (MMSP), 2017.

Investigations into Bluetooth Low Energy Localization Precision Limits

J. Schmalenstroeer, R. Haeb-Umbach, in: 24th European Signal Processing Conference (EUSIPCO 2016), 2016.

Aligning training models with smartphone properties in WiFi fingerprinting based indoor localization

M.K. Hoang, J. Schmalenstroeer, R. Haeb-Umbach, in: 40th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2015), 2015.

A combined hardware-software approach for acoustic sensor network synchronization

J. Schmalenstroeer, P. Jebramcik, R. Haeb-Umbach, Signal Processing (2014).

A Gossiping Approach to Sampling Clock Synchronization in Wireless Acoustic Sensor Networks

J. Schmalenstroeer, P. Jebramcik, R. Haeb-Umbach, in: 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), 2014.

Online Observation Error Model Estimation for Acoustic Sensor Network Synchronization

J. Schmalenstroeer, W. Zhao, R. Haeb-Umbach, in: 11. ITG Fachtagung Sprachkommunikation (ITG 2014), 2014.

A Novel Initialization Method for Unsupervised Learning of Acoustic Patterns in Speech (FGNT-2013-01)

O. Walter, J. Schmalenstroeer, R. Haeb-Umbach, 2013.

DoA-Based Microphone Array Position Self-Calibration Using Circular Statistic

F. Jacob, J. Schmalenstroeer, R. Haeb-Umbach, in: 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), 2013, pp. 116–120.

Sampling Rate Synchronisation in Acoustic Sensor Networks with a Pre-Trained Clock Skew Error Model

J. Schmalenstroeer, R. Haeb-Umbach, in: 21st European Signal Processing Conference (EUSIPCO 2013), 2013.

Server based indoor navigation using RSSI and inertial sensor information

M.K. Hoang, S. Schmitz, C. Drueke, D.H.T. Vu, J. Schmalenstroeer, R. Haeb-Umbach, in: 10th Workshop on Positioning Navigation and Communication (WPNC 2013), 2013, pp. 1–6.

A Hidden Markov Model for Indoor User Tracking Based on WiFi Fingerprinting and Step Detection

M.K. Hoang, J. Schmalenstroeer, C. Drueke, D.H. Tran Vu, R. Haeb-Umbach, in: 21st European Signal Processing Conference (EUSIPCO 2013), 2013.

Microphone Array Position Self-Calibration from Reverberant Speech Input

F. Jacob, J. Schmalenstroeer, R. Haeb-Umbach, in: International Workshop on Acoustic Signal Enhancement (IWAENC 2012), 2012.

Smartphone-Based Sensor Fusion for Improved Vehicular Navigation

O. Walter, J. Schmalenstroeer, A. Engler, R. Haeb-Umbach, in: 9th Workshop on Positioning Navigation and Communication (WPNC 2012), 2012.

Unsupervised learning of acoustic events using dynamic time warping and hierarchical K-means++ clustering

J. Schmalenstroeer, M. Bartek, R. Haeb-Umbach, in: Interspeech 2011, 2011.

Unsupervised Geometry Calibration of Acoustic Sensor Networks Using Source Correspondences

J. Schmalenstroeer, F. Jacob, R. Haeb-Umbach, M. Hennecke, G.A. Fink, in: Interspeech 2011, 2011.

Investigations into Features for Robust Classification into Broad Acoustic Categories

J. Schmalenstroeer, M. Bartek, R. Haeb-Umbach, in: 37. Deutsche Jahrestagung Fuer Akustik (DAGA 2011), 2011.

Online Diarization of Streaming Audio-Visual Data for Smart Environments

J. Schmalenstroeer, R. Haeb-Umbach, IEEE Journal of Selected Topics in Signal Processing 4 (2010) 845–856.

Audio-Visual Data Processing for Ambient Communication

J. Schmalenstroeer, V. Leutnant, R. Haeb-Umbach, in: 1st International Workshop on Distributed Computing in Ambient Environments within 32nd Annual Conference on Artificial Intelligence, 2009.

A hierarchical approach to unsupervised shape calibration of microphone array networks

M. Hennecke, T. Ploetz, G.A. Fink, J. Schmalenstroeer, R. Haeb-Umbach, in: IEEE/SP 15th Workshop on Statistical Signal Processing (SSP 2009), 2009, pp. 257–260.

Fusing Audio and Video Information for Online Speaker Diarization

J. Schmalenstroeer, M. Kelling, V. Leutnant, R. Haeb-Umbach, in: Interspeech 2009, 2009.

Joint Speaker Segmentation, Localization and Identification for Streaming Audio

J. Schmalenstroeer, R. Haeb-Umbach, in: Interspeech 2007, 2007.

Amigo Context Management Service with Applications in Ambient Communication Scenarios

J. Schmalenstroeer, V. Leutnant, R. Haeb-Umbach, in: AMI-07 - European Conference on Ambient Intelligence, 2007.

Zweistufige Sprache/Pause-Detektion in stark gestoerter Umgebung

E. Warsitz, R. Haeb-Umbach, J. Schmalenstroeer, in: 33. Deutsche Jahrestagung Fuer Akustik (DAGA 2007), 2007.

Projekt Amigo - Sprachsignalverarbeitung im vernetzten Haus

J. Schmalenstroeer, E. Warsitz, R. Haeb-Umbach, in: 33. Deutsche Jahrestagung Fuer Akustik (DAGA 2007), 2007.

Online Speaker Change Detection by Combining BIC with Microphone Array Beamforming

J. Schmalenstroeer, R. Haeb-Umbach, in: Interspeech 2006, 2006.

Speech Processing in the Networked Home Environment - A View on the Amigo Project

R. Haeb-Umbach, J. Schmalenstroeer, in: Interspeech 2005, Lisboa, 2005.

A Comparison of Particle Filtering Variants for Speech Feature Enhancement

R. Haeb-Umbach, J. Schmalenstroeer, in: Interspeech 2005, 2005.