Yang Ai
2020 – today
- 2024
- [j12] Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling:
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1430-1444 (2024)
- [j11] Yang Ai, Zhen-Hua Ling:
Low-Latency Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2283-2296 (2024)
- [j10] Yang Ai, Xiao-Hang Jiang, Ye-Xin Lu, Hui-Peng Du, Zhen-Hua Ling:
APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3256-3269 (2024)
- [c26] Kangdi Mei, Zhaoci Liu, Hui-Peng Du, Hengyu Li, Yang Ai, Liping Chen, Zhenhua Ling:
Considering Temporal Connection between Turns for Conversational Speech Synthesis. ICASSP 2024: 11426-11430
- [c25] Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling:
Speech Reconstruction from Silent Lip and Tongue Articulation by Diffusion Models and Text-Guided Pseudo Target Generation. ACM Multimedia 2024: 6559-6568
- [i27] Ye-Xin Lu, Yang Ai, Hui-Peng Du, Zhen-Hua Ling:
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction. CoRR abs/2401.06387 (2024)
- [i26] Yang Ai, Xiao-Hang Jiang, Ye-Xin Lu, Hui-Peng Du, Zhen-Hua Ling:
APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding. CoRR abs/2402.10533 (2024)
- [i25] Yang Ai, Zhen-Hua Ling:
Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks. CoRR abs/2403.17378 (2024)
- [i24] Zhengyan Sheng, Yang Ai, Li-Juan Liu, Jia Pan, Zhen-Hua Ling:
Voice Attribute Editing with Text Prompt. CoRR abs/2404.08857 (2024)
- [i23] Hui-Peng Du, Ye-Xin Lu, Yang Ai, Zhen-Hua Ling:
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation. CoRR abs/2406.02162 (2024)
- [i22] Ye-Xin Lu, Yang Ai, Zheng-Yan Sheng, Zhen-Hua Ling:
Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control. CoRR abs/2406.02250 (2024)
- [i21] Hengyu Li, Kangdi Mei, Zhaoci Liu, Yang Ai, Liping Chen, Jie Zhang, Zhenhua Ling:
Refining Self-Supervised Learnt Speech Representation using Brain Activations. CoRR abs/2406.08266 (2024)
- [i20] Fei Liu, Yang Ai, Hui-Peng Du, Ye-Xin Lu, Rui-Chen Zheng, Zhen-Hua Ling:
Stage-Wise and Prior-Aware Neural Speech Phase Prediction. CoRR abs/2410.04990 (2024)
- [i19] Hui-Peng Du, Yang Ai, Rui-Chen Zheng, Zhen-Hua Ling:
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm. CoRR abs/2410.22807 (2024)
- [i18] Xiao-Hang Jiang, Yang Ai, Rui-Chen Zheng, Hui-Peng Du, Ye-Xin Lu, Zhen-Hua Ling:
MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios. CoRR abs/2411.00464 (2024)
- 2023
- [j9] Yang Ai, Ye-Xin Lu, Zhen-Hua Ling:
Long-Frame-Shift Neural Speech Phase Prediction With Spectral Continuity Enhancement and Interpolation Error Compensation. IEEE Signal Process. Lett. 30: 1097-1101 (2023)
- [j8] Yang Ai, Zhen-Hua Ling:
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2145-2157 (2023)
- [c24] Haochen Wu, Zhuhai Li, Luzhen Xu, Zhentao Zhang, Wenting Zhao, Bin Gu, Yang Ai, Yexin Lu, Jie Zhang, Zhenhua Ling, Wu Guo:
The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge. DADA@IJCAI 2023: 119-124
- [c23] Jing Liu, Jing Ma, Yang Ai, Jiayue Zhao, Fang Wang, Lanfen Lin, Ruofeng Tong, Yen-Wei Chen, Jingsong Li:
Vision-Guided Attention-Enhanced Network for Predicting Microvascular Invasion in Hepatocellular Carcinoma. EMBC 2023: 1-4
- [c22] Jing Liu, Yulin Yang, Yang Ai, Titinunt Kitrungrotsakul, Fang Wang, Lanfen Lin, Ruofeng Tong, Yen-Wei Chen, Jingsong Li:
MVI-Wise GAN: Synthetic MRI to Improve Microvascular Invasion Prediction in Hepatocellular Carcinoma. EMBC 2023: 1-4
- [c21] Yang Ai, Yinhao Li, Rahul Kumar Jain, Yen-Wei Chen:
A Self-Attention Based Fusion Model of Radiomics and Deep Features for Early Recurrence Prediction in NSCLC. GCCE 2023: 833-837
- [c20] Yang Ai, Zhen-Hua Ling:
Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses. ICASSP 2023: 1-5
- [c19] Zhengyan Sheng, Yang Ai, Zhen-Hua Ling:
Zero-Shot Personalized Lip-To-Speech Synthesis with Face Image Based Voice Control. ICASSP 2023: 1-5
- [c18] Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling:
Speech Reconstruction from Silent Tongue and Lip Articulation by Pseudo Target Generation and Domain Adversarial Training. ICASSP 2023: 1-5
- [c17] Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling:
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation. INTERSPEECH 2023: 844-848
- [c16] Ye-Xin Lu, Yang Ai, Zhen-Hua Ling:
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra. INTERSPEECH 2023: 3834-3838
- [c15] Jing Liu, Yang Ai, Chao Huang, Fang Wang, Yingying Xu, Titinunt Kitrungrotsakul, Jing Ma, Lanfen Lin, Yen-Wei Chen, Jingsong Li:
CMIR: A Unified Cross-Modality Framework for Preoperative Accurate Prediction of Microvascular Invasion in Hepatocellular Carcinoma. MedInfo 2023: 936-940
- [c14] Zhengyan Sheng, Yang Ai, Yan-Nian Chen, Zhen-Hua Ling:
Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment. ACM Multimedia 2023: 8443-8452
- [i17] Ye-Xin Lu, Yang Ai, Zhen-Hua Ling:
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis. CoRR abs/2304.13270 (2023)
- [i16] Yang Ai, Zhen-Hua Ling:
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra. CoRR abs/2305.07952 (2023)
- [i15] Zhengyan Sheng, Yang Ai, Zhen-Hua Ling:
Zero-shot personalized lip-to-speech synthesis with face image based voice control. CoRR abs/2305.14359 (2023)
- [i14] Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling:
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation. CoRR abs/2305.14933 (2023)
- [i13] Yang Ai, Ye-Xin Lu, Zhen-Hua Ling:
Long-frame-shift Neural Speech Phase Prediction with Spectral Continuity Enhancement and Interpolation Error Compensation. CoRR abs/2308.08850 (2023)
- [i12] Ye-Xin Lu, Yang Ai, Zhen-Hua Ling:
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement. CoRR abs/2308.08926 (2023)
- [i11] Zhengyan Sheng, Yang Ai, Yan-Nian Chen, Zhen-Hua Ling:
Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment. CoRR abs/2309.09470 (2023)
- [i10] Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling:
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement. CoRR abs/2309.10455 (2023)
- [i9] Yang Ai, Xi Yang:
A Dynamic Network for Efficient Point Cloud Registration. CoRR abs/2312.02877 (2023)
- 2022
- [j7] Zilong Liu, Jingbing Li, Yang Ai, Yuancai Zheng, Jing Liu:
A robust encryption watermarking algorithm for medical images based on ridgelet-DCT and THM double chaos. J. Cloud Comput. 11: 60 (2022)
- [j6] Yang Ai, Zhen-Hua Ling, Wei-Lu Wu, Ang Li:
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Statistical Parametric Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2036-2048 (2022)
- [c13] Yang Ai, Panyanat Aonpong, Weibin Wang, Yinhao Li, Yutaro Iwamoto, Xianhua Han, Yen-Wei Chen:
Residual Multilayer Perceptrons for Genotype-Guided Recurrence Prediction of Non-Small Cell Lung Cancer. EMBC 2022: 447-450
- [i8] Yang Ai, Zhen-Hua Ling:
Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses. CoRR abs/2211.15974 (2022)
- 2021
- [j5] Kun Shao, Junan Yang, Yang Ai, Hui Liu, Yu Zhang:
BDDR: An Effective Defense Against Textual Backdoor Attacks. Comput. Secur. 110: 102433 (2021)
- [c12] Chang Liu, Yang Ai, Zhenhua Ling:
Phase Spectrum Recovery for Enhancing Low-Quality Speech Captured by Laser Microphones. ISCSLP 2021: 1-5
- [c11] Yang Ai, Haoyu Li, Xin Wang, Junichi Yamagishi, Zhen-Hua Ling:
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation. SLT 2021: 477-484
- [c10] Haoyu Li, Yang Ai, Junichi Yamagishi:
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model. SLT 2021: 734-741
- 2020
- [j4] Jing Liu, Jixin Ma, Jingbing Li, Mengxing Huang, Naveed Sadiq, Yang Ai:
Robust Watermarking Algorithm for Medical Volume Data in Internet of Medical Things. IEEE Access 8: 93939-93961 (2020)
- [j3] Yang Ai, Zhen-Hua Ling:
A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 28: 839-851 (2020)
- [c9] Qiuchen Huang, Yang Ai, Zhenhua Ling:
Online Speaker Adaptation for WaveNet-based Neural Vocoders. APSIPA 2020: 815-820
- [c8] Yang Ai, Zhen-Hua Ling:
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders. INTERSPEECH 2020: 190-194
- [c7] Yang Ai, Xin Wang, Junichi Yamagishi, Zhen-Hua Ling:
Reverberation Modeling for Source-Filter-Based Neural Vocoder. INTERSPEECH 2020: 3560-3564
- [i7] Yang Ai, Zhen-Hua Ling:
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders. CoRR abs/2004.07832 (2020)
- [i6] Yang Ai, Xin Wang, Junichi Yamagishi, Zhen-Hua Ling:
Reverberation Modeling for Source-Filter-based Neural Vocoder. CoRR abs/2005.07379 (2020)
- [i5] Yang Ai, Haoyu Li, Xin Wang, Junichi Yamagishi, Zhen-Hua Ling:
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation. CoRR abs/2011.03955 (2020)
- [i4] Haoyu Li, Yang Ai, Junichi Yamagishi:
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model. CoRR abs/2011.05038 (2020)
2010 – 2019
- 2019
- [j2] Jing Liu, Jingbing Li, Kun Zhang, Uzair Aslam Bhatti, Yang Ai:
Zero-Watermarking Algorithm for Medical Images Based on Dual-Tree Complex Wavelet Transform and Discrete Cosine Transform. J. Medical Imaging Health Informatics 9(1): 188-194 (2019)
- [c6] Yuan Jiang, Ya-Jun Hu, Li-Juan Liu, Hong-Chuan Wu, Zhi-Kun Wang, Yang Ai, Zhen-Hua Ling, Li-Rong Dai:
The USTC System for Blizzard Challenge 2019. Blizzard Challenge 2019
- [c5] Yang Ai, Jing-Xuan Zhang, Liang Chen, Zhen-Hua Ling:
DNN-based Spectral Enhancement for Neural Waveform Generators with Low-bit Quantization. ICASSP 2019: 7025-7029
- [c4] Yuan-Hao Yi, Yang Ai, Zhen-Hua Ling, Li-Rong Dai:
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling. INTERSPEECH 2019: 2593-2597
- [i3] Yuan-Hao Yi, Yang Ai, Zhen-Hua Ling, Li-Rong Dai:
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling. CoRR abs/1906.08977 (2019)
- [i2] Yang Ai, Zhen-Hua Ling:
A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis. CoRR abs/1906.09573 (2019)
- 2018
- [j1] Zhen-Hua Ling, Yang Ai, Yu Gu, Li-Rong Dai:
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension. IEEE ACM Trans. Audio Speech Lang. Process. 26(5): 883-894 (2018)
- [c3] Yang Ai, Hong-Chuan Wu, Zhen-Hua Ling:
SampleRNN-Based Neural Vocoder for Statistical Parametric Speech Synthesis. ICASSP 2018: 5659-5663
- [i1] Zhen-Hua Ling, Yang Ai, Yu Gu, Li-Rong Dai:
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension. CoRR abs/1801.07910 (2018)
- 2010
- [c2] Hao Xu, Changhai Zhang, Yang Ai, Ziwen Wang, Zhan-Shan Li:
An Ontology-Based Platform for Scientific Writing and Publishing. FGIT 2010: 267-271
2000 – 2009
- 2009
- [c1] Kun Wang, Zhan-Shan Li, Yang Ai, Yonggang Zhang:
Computing Minimal Diagnosis with Binary Decision Diagrams Algorithm. FSKD (1) 2009: 145-149
last updated on 2024-12-13 20:06 CET by the dblp team
all metadata released as open data under CC0 1.0 license