[C27] Application of Adversarial Domain Adaptation to Voice Activity Detection


Voice Activity Detection (VAD) is becoming an essential front-end component in various speech processing systems. As those systems are commonly deployed in environments with diverse noise types and low signal-to-noise ratios (SNRs), an effective VAD method should robustly detect speech regions in noisy background signals. In this paper, we propose applying an adversarial domain adaptation technique to VAD. The proposed method trains DNN models for the VAD task in a supervised manner while simultaneously mitigating the domain mismatch between noisy and clean audio streams in an unsupervised manner. The experimental results show that the proposed method detects speech more robustly in noisy environments than other DNN-based models trained on hand-crafted acoustic features.

Intelligent Systems Conference (IntelliSys) 2021