115 noises data

Data for the 115 noise types used to construct our best 16 kHz DNN-based speech enhancement system (submitted to Interspeech 2015 [3], demo)

*Yong Xu, Jun Du, Li-Rong Dai and Chin-Hui Lee, Fellow, IEEE

E-mail: xuyong62@mail.ustc.edu.cn, *Yong Xu

Including a wide variety of noise types in DNN training for speech enhancement has been shown to be important for improving the generalization capacity of the DNN [1].

The N1-N100 noises were collected by Guoning Hu and can be downloaded at http://web.cse.ohio-state.edu/pnl/corpus/HuNonspeech/HuCorpus.html (zip1).

We expanded the set to 115 noise types by adding, for example, musical noises, which are very common in real-world acoustic environments.

The other 15 noise types were prepared in-house by USTC (download: zip2):

N101: AWGN;

N102: Babble;

N103-N105: Car;

N106-N115: musical instruments.
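The index ranges above can be captured in a small lookup helper. The following is a minimal Python sketch, not part of the released data: the function name `noise_category` and the label strings are hypothetical and chosen only for illustration, and the sketch assumes the noises are addressed by their integer index (1 for N1 through 115 for N115).

```python
def noise_category(index: int) -> str:
    """Map a noise index (1 for N1 ... 115 for N115) to its category.

    The ranges follow the list on this page; the label strings are
    illustrative assumptions, not names used in the data files.
    """
    if not 1 <= index <= 115:
        raise ValueError(f"noise index must be in 1..115, got {index}")
    if index <= 100:
        return "Hu nonspeech corpus"   # N1-N100, collected by Guoning Hu
    if index == 101:
        return "AWGN"                  # N101
    if index == 102:
        return "Babble"                # N102
    if index <= 105:
        return "Car"                   # N103-N105
    return "Musical instruments"       # N106-N115


if __name__ == "__main__":
    # Print the category of a few boundary indices.
    for n in (1, 100, 101, 102, 103, 105, 106, 115):
        print(f"N{n}: {noise_category(n)}")
```

Such a helper could be used, for example, to group enhancement results by noise category when reporting per-noise performance.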

[1] A Regression Approach to Speech Enhancement Based on Deep Neural Networks [pdf], Yong Xu, Jun Du, Li-Rong Dai and Chin-Hui Lee, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 1, pp. 7–19, 2015

[2] An Experimental Study on Speech Enhancement Based on Deep Neural Networks [pdf], Yong Xu, Jun Du, Li-Rong Dai and Chin-Hui Lee, IEEE Signal Processing Letters, vol. 21, no. 1, pp. 65–68, January 2014

[3] Multi-objective Learning and Mask-based Post-processing for Deep Neural Network based Speech Enhancement, Yong Xu, Jun Du, Zhen Huang, Li-Rong Dai and Chin-Hui Lee, submitted to Interspeech 2015, demo