• Julian Reck
  • 01.12.2015
  • 02.05.2016

In everyday life, humans are constantly surrounded by 3D sound in all kinds of environments. One method for reproducing such spatial sound is binaural rendering. With this approach, virtual sound sources can be synthesized at arbitrary positions by evoking properly designed sound fields at the listener's ears. Ideally, the listener then has the impression of being immersed in the original acoustic scene, for example in an opera house. In the simplest case, binaural rendering is done via headphones. However, headphones acoustically isolate the listener from the environment and may become uncomfortable when worn for a long time. Untethered sound rendering via distant loudspeakers is therefore desirable, since it increases listener comfort. The main challenge of this approach is that each loudspeaker signal arrives at both ears and thus generates undesired cross-talk. That is, the left loudspeaker signal is also captured by the right ear (and vice versa), which corrupts the desired hearing impression.



The undesired cross-talk components can be attenuated by exploiting knowledge of the acoustic propagation paths from the loudspeakers to the listener’s ears, which are described by Head-Related Transfer Functions (HRTFs). To achieve the best performance, the HRTFs should be measured accurately for each individual listener at the actual listening position. Since these measurements are tedious, it is more practical to approximate the HRTFs using a model or a generic HRTF database, e.g., one obtained from measurements with an artificial head (manikin). This approximation, however, implies a mismatch between the HRTFs used for the design and the actual HRTFs of the listener, which may severely degrade the reproduction performance.
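The idea of cancelling cross-talk with known propagation paths can be illustrated at a single frequency. The following is only a minimal sketch under stated assumptions: two loudspeakers, two ears, a made-up 2x2 transfer matrix `H` (not measured HRTF data), and a plain regularized matrix inverse as the canceller. The function name `crosstalk_canceller` and the parameter `beta` are hypothetical, and this is not the beamforming design used in the thesis.

```python
import numpy as np


def crosstalk_canceller(H, beta=0.0):
    """Regularized inverse of the transfer matrix H (ears x loudspeakers).

    Hypothetical helper for illustration only. With beta = 0 and an
    invertible H this reduces to the exact inverse of H.
    """
    H = np.asarray(H, dtype=complex)
    Hh = H.conj().T
    return Hh @ np.linalg.inv(H @ Hh + beta * np.eye(H.shape[0]))


# Assumed toy transfer matrix at one frequency: unit-gain direct paths,
# attenuated and delayed cross paths (illustrative values, not HRTF data).
cross = 0.5 * np.exp(-0.6j)
H = np.array([[1.0, cross],
              [cross, 1.0]])

C = crosstalk_canceller(H)  # filter matrix: binaural signals -> loudspeaker signals
G = H @ C                   # overall response: binaural signals -> ear signals

# Off-diagonal entries of G are the residual cross-talk paths.
print(np.round(np.abs(G), 6))
```

When the matrix used for the design equals the actual one, the off-diagonal entries of `G` vanish up to numerical precision, so each ear receives only its intended signal. Any difference between the two matrices leaves residual cross-talk.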



In this thesis, a binaural rendering system shall be evaluated with respect to its robustness against HRTF mismatch. Furthermore, modifications of the already implemented system that lead to higher robustness shall be investigated. As a starting point, we consider a single-listener scenario in an anechoic environment. The binaural rendering system is based on robust superdirective beamforming, for whose design different available HRTFs as well as an HRTF model can be utilized. The performance of the binaural rendering system is evaluated in terms of channel separation.
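To make the evaluation criterion concrete, the sketch below computes a channel separation value with and without HRTF mismatch. It rests on several assumptions that are not taken from the thesis: channel separation is defined here as the level ratio (in dB) between the intended ear and the opposite ear, the transfer matrices are toy values at a single frequency, the mismatch is modeled as a 20 % gain error on the cross paths, and a regularized inverse stands in for the robust superdirective beamformer. All function names are hypothetical.

```python
import numpy as np


def regularized_inverse(H, beta):
    """Stand-in filter design (assumption): regularized inverse of H."""
    H = np.asarray(H, dtype=complex)
    Hh = H.conj().T
    return Hh @ np.linalg.inv(H @ Hh + beta * np.eye(H.shape[0]))


def channel_separation_db(H_actual, C):
    """Assumed definition: level at the intended ear over level at the other ear."""
    G = np.asarray(H_actual, dtype=complex) @ C  # binaural input -> ear signals
    left = 20.0 * np.log10(abs(G[0, 0]) / abs(G[1, 0]))
    right = 20.0 * np.log10(abs(G[1, 1]) / abs(G[0, 1]))
    return left, right


# Toy single-frequency transfer matrices (illustrative values, not HRTF data).
cross = 0.5 * np.exp(-0.6j)
H_design = np.array([[1.0, cross],
                     [cross, 1.0]])              # HRTFs assumed for the design
H_actual = np.array([[1.0, 1.2 * cross],
                     [1.2 * cross, 1.0]])        # listener's HRTFs, cross paths 20 % off

C = regularized_inverse(H_design, beta=0.01)

matched = channel_separation_db(H_design, C)     # design HRTFs equal actual HRTFs
mismatched = channel_separation_db(H_actual, C)  # design HRTFs differ from actual HRTFs

print("matched    [dB]:", np.round(matched, 1))
print("mismatched [dB]:", np.round(mismatched, 1))
```

In this toy setup, the separation obtained with the mismatched matrix is clearly lower than in the matched case, which is the kind of degradation a robustness evaluation would quantify over frequency and over different HRTF sets.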