Binaural rendering of spherical microphone array recordings by directly synthesizing the spatial pattern of the head-related transfer function

Shuichi Sakamoto, César Salvador, Jorge Treviño, Yôiti Suzuki

Research output: Contribution to conference › Paper › peer-review

Abstract

Binaural technologies can convey rich spatial auditory information to listeners using simple equipment such as headphones. Advanced binaural recording and reproduction methods use spherical microphone arrays and head-related transfer function (HRTF) datasets. Mainstream techniques, such as binaural Ambisonics, characterize the recorded sound field as a weighted sum of spherical harmonic functions. In contrast, this research seeks to generate individualized binaural signals directly from the microphone recordings, without relying on intermediate sound field representations. The approach, known as SENZI, applies a set of weighting filters to the recorded microphone signals so as to produce the target spatial pattern defined by the HRTF dataset. The proposal therefore requires finding the appropriate weighting filters by inverting a linear system. Binaural synthesis methods based on the solution to an inverse problem belong to one of two categories: HRTF modeling (type 1) or microphone signal modeling (type 2). The SENZI method considered here belongs to the HRTF modeling category. In addition, the problem is generally over- or underdetermined, depending on the number of microphones in the array and the number of HRTFs in the dataset. This also affects the accuracy of the synthesized binaural signals. A design problem, therefore, is to choose the most appropriate number of microphones and HRTFs. Fortunately, large HRTF datasets and massively multi-channel arrays are now available. An example of the latter is a real-time implementation of the SENZI method using a 252-channel spherical microphone array and an FPGA-based processing subsystem. This research evaluates the binaural synthesis accuracy in relation to the number of microphones and HRTFs used to derive the weighting filters. Numerical simulations show that underdetermined systems generally yield better results than overdetermined ones.
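The inverse problem described in the abstract can be illustrated with a minimal sketch at a single frequency bin. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `weighting_filters`, the matrix `C` of microphone responses per direction, the Tikhonov regularization parameter `reg`, and the random stand-in data used in place of a real array model and HRTF dataset. It only shows how a system with D HRTF directions and M microphones becomes over- or underdetermined, and one common way to solve each case.

```python
import numpy as np

def weighting_filters(C, h, reg=1e-8):
    """Tikhonov-regularized solution of C @ w ~ h at one frequency bin.

    C : (D, M) complex array; C[d, m] is the response of microphone m
        to a source in direction d (hypothetical array model).
    h : (D,) complex array; HRTF of one ear for the same D directions.
    Returns w : (M,) complex weights, one per microphone.
    """
    D, M = C.shape
    CH = C.conj().T
    if D >= M:
        # Overdetermined: more HRTF directions than microphones.
        return np.linalg.solve(CH @ C + reg * np.eye(M), CH @ h)
    # Underdetermined: more microphones than HRTF directions.
    return CH @ np.linalg.solve(C @ CH + reg * np.eye(D), h)

rng = np.random.default_rng(0)

def random_complex(*shape):
    # Random stand-in data; a real study would use measured or modeled responses.
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

for D, M in [(32, 8), (8, 32)]:
    C = random_complex(D, M)
    h = random_complex(D)
    w = weighting_filters(C, h)
    residual = np.linalg.norm(C @ w - h) / np.linalg.norm(h)
    print(f"D={D:3d} directions, M={M:3d} microphones, "
          f"relative residual={residual:.3e}")
```

The residual printed here is measured only at the directions used to fit the weights, so it is not the accuracy measure evaluated in the paper. Assessing synthesis accuracy would also require checking directions that were not part of the fit.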

Original language: English
Publication status: Published - 2017
Event: 24th International Congress on Sound and Vibration, ICSV 2017 - London, United Kingdom
Duration: 2017 Jul 23 to 2017 Jul 27

Conference

Conference: 24th International Congress on Sound and Vibration, ICSV 2017
Country/Territory: United Kingdom
City: London
Period: 17/7/23 to 17/7/27

Keywords

  • 3D audio technology
  • Binaural synthesis
  • Head-related transfer functions
  • Microphone arrays
  • Spherical acoustics
