Yang Ai (艾杨)
Postdoctoral Researcher
National Engineering Research Center of Speech and Language Information Processing,
University of Science and Technology of China,
Hefei, Anhui, P.R. China, 230027
Email: yangai@ustc.edu.cn
Tel: +86-13605693217
Biography
Yang Ai received the B.S. degree in communication engineering from Xiamen University (XMU), Xiamen, China, in 2016, and the Ph.D. degree in signal and information processing from the University of Science and Technology of China (USTC), Hefei, China, in 2021. From February 2020 to August 2020, he was a visiting Ph.D. student at the National Institute of Informatics (NII), Tokyo, Japan. He also worked at the National University of Defense Technology (NUDT), Hefei, China, as a lecturer from July 2021 to March 2022. He is currently a Postdoctoral Researcher with the University of Science and Technology of China (USTC), Hefei, China.
Research Interests
Speech Synthesis
- Statistical Parametric Speech Synthesis
- Deep Learning for Speech Synthesis
- Neural Waveform Generation
- Neural Vocoder
- Speech Bandwidth Extension
Speech Enhancement
- Speech Denoising and Dereverberation
- Deep Learning for Speech Enhancement
Publications
- Yang Ai*, Zhen-Hua Ling, Wei-Lu Wu and Ang Li, “Denoising-and-dereverberation hierarchical neural vocoder for statistical parametric speech synthesis,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, Accepted, 2022.
- Yang Ai and Zhen-Hua Ling*, “A neural vocoder with hierarchical generation of amplitude and phase spectra for statistical parametric speech synthesis,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 839–851, 2020.
- Zhen-Hua Ling*, Yang Ai, Yu Gu, and Li-Rong Dai, “Waveform modeling and generation using hierarchical recurrent neural networks for speech bandwidth extension,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 5, pp. 883–894, 2018.
- Yang Ai, Hao-Yu Li, Xin Wang, Junichi Yamagishi and Zhen-Hua Ling, “Denoising-and-dereverberation hierarchical neural vocoder for robust waveform generation,” in Proc. SLT, 2021, pp. 477-484.
- Yang Ai and Zhen-Hua Ling, “Knowledge-and-data-driven amplitude spectrum prediction for hierarchical neural vocoders,” in Proc. Interspeech, 2020, pp. 190-194.
- Yang Ai, Xin Wang, Junichi Yamagishi and Zhen-Hua Ling, “Reverberation modeling for source-filter-based neural vocoder,” in Proc. Interspeech, 2020, pp. 3560-3564.
- Yang Ai, Jing-Xuan Zhang, Liang Chen, and Zhen-Hua Ling, “DNN-based spectral enhancement for neural waveform generators with low-bit quantization,” in Proc. ICASSP, 2019, pp. 7025-7029.
- Yang Ai, Hong-Chuan Wu, and Zhen-Hua Ling, “SampleRNN-based neural vocoder for statistical parametric speech synthesis,” in Proc. ICASSP, 2018, pp. 5659-5663.
- Hao-Yu Li, Yang Ai, and Junichi Yamagishi, “Enhancing low-quality voice recordings using disentangled channel factor and neural waveform model,” in Proc. SLT, 2021, pp. 2452-2456.
- Chang Liu, Yang Ai, and Zhen-Hua Ling, “Phase spectrum recovery for enhancing low-quality speech captured by laser microphones,” in Proc. ISCSLP, 2021, pp. 1-5.
- Qiu-Chen Huang, Yang Ai, and Zhen-Hua Ling, “Online speaker adaptation for WaveNet-based neural vocoders,” in Proc. APSIPA, 2020, pp. 815-820.
- Yuan-Hao Yi, Yang Ai, Zhen-Hua Ling, and Li-Rong Dai, “Singing voice synthesis using deep autoregressive neural networks for acoustic modeling,” in Proc. Interspeech, 2019, pp. 2593–2597.
- Yuan Jiang, Ya-Jun Hu, Li-Juan Liu, Hong-Chuan Wu, Zhi-Kun Wang, Yang Ai, Zhen-Hua Ling, and Li-Rong Dai, “The USTC system for Blizzard Challenge 2019,” in Blizzard Challenge Workshop, 2019.
- Kun Shao, Jun-An Yang, Yang Ai, Hui Liu and Yu Zhang, “BDDR: An Effective Defense Against Textual Backdoor Attacks,” Computers & Security, vol. 110, pp. 102433, 2021.
Awards
- The best system for Blizzard Challenge (2018).
- The best system for Blizzard Challenge (2019).
- National Scholarship for Ph.D. Students (2020).
- Honorary Title of Outstanding Graduates of University of Science and Technology of China (2021).
Service
- Reviewer for IEEE/ACM Transactions on Audio, Speech, and Language Processing (2020).