TY - GEN
T1 - Unsupervised Adaptation of Neural Networks for Discriminative Sound Source Localization with Eliminative Constraint
AU - Takeda, Ryu
AU - Kudo, Yoshiki
AU - Takashima, Kazuki
AU - Kitamura, Yoshifumi
AU - Komatani, Kazunori
N1 - Funding Information:
This work was partly supported by JSPS KAKENHI Grant Number JP16H02869 and the Cooperative Research Project Program of the RIEC, Tohoku University.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/9/10
Y1 - 2018/9/10
AB - This paper describes an unsupervised adaptation method of deep neural networks (DNNs) regarding discriminative sound source localization (SSL). DNNs-based SSL and its unsupervised adaptation fail under different conditions from those during training. The estimations sometimes include incoherent unpredictable errors due to the NN's non-linearity. We propose an eliminative posterior probability constraint using a model-based SSL for unsupervised DNNs adaptation. This constraint forces the probability of 'less possible candidates' to become zero to eliminate incoherent errors. The candidates are indicated by a model-based SSL method because it can estimate the azimuth of the sound source with moderate accuracy and explicit reasoning. As a result, the localization performance of adapted DNNs improved more than that of model-based SSL. Experimental results showed that our method improved localization correctness of 1D azimuth and 3D regions by a maximum of 13.3 and 5.9 points compared with the model-based SSL.
KW - Neural networks
KW - Sound source localization
KW - Unsupervised adaptation
UR - http://www.scopus.com/inward/record.url?scp=85054254350&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054254350&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2018.8461723
DO - 10.1109/ICASSP.2018.8461723
M3 - Conference contribution
AN - SCOPUS:85054254350
SN - 9781538646588
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 3514
EP - 3518
BT - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Y2 - 15 April 2018 through 20 April 2018
ER -