Xinfa Zhu
2020 – today
- 2025
- [i18] Xinfa Zhu, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie:
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training. CoRR abs/2501.04416 (2025)
- [i17] Xuelong Geng, Kun Wei, Qijie Shao, Shuiyun Liu, Zhennan Lin, Zhixian Zhao, Guojian Li, Wenjie Tian, Peikun Chen, Yangze Li, Pengcheng Guo, Mingchen Shao, Shuiyuan Wang, Yuang Cao, Chengyou Wang, Tianyi Xu, Yuhang Dai, Xinfa Zhu, Yue Li, Li Zhang, Lei Xie:
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia. CoRR abs/2501.13306 (2025)
- [i16] Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Xi Wang, Sheng Zhao, Lei Xie:
CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions. CoRR abs/2501.16761 (2025)
- 2024
- [j4] Xinfa Zhu, Yi Lei, Tao Li, Yongmao Zhang, Hongbin Zhou, Heng Lu, Lei Xie:
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1506-1518 (2024)
- [j3] Tao Li, Zhichao Wang, Xinfa Zhu, Jian Cong, Qiao Tian, Yuping Wang, Lei Xie:
U-Style: Cascading U-Nets With Multi-Level Speaker and Style Modeling for Zero-Shot Voice Cloning. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4026-4035 (2024)
- [c10] Ziqian Wang, Xinfa Zhu, Zihan Zhang, Yuanjun Lv, Ning Jiang, Guoqing Zhao, Lei Xie:
SELM: Speech Enhancement using Discrete Tokens and Language Models. ICASSP 2024: 11561-11565
- [c9] Hanzhao Li, Xinfa Zhu, Liumeng Xue, Yang Song, Yunlin Chen, Lei Xie:
Spontts: Modeling and Transferring Spontaneous Style for TTS. ICASSP 2024: 12171-12175
- [c8] Xinfa Zhu, Yuke Li, Yi Lei, Ning Jiang, Guoqing Zhao, Lei Xie:
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-Supervised Contrastive Learning. ICME 2024: 1-6
- [c7] Dake Guo, Jixun Yao, Xinfa Zhu, Kangxiang Xia, Zhao Guo, Ziyu Zhang, Yao Wang, Jie Liu, Lei Xie:
The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge. ISCSLP 2024: 616-620
- [c6] Yujia Xiao, Xi Wang, Xu Tan, Lei He, Xinfa Zhu, Sheng Zhao, Tan Lee:
Contrastive Context-Speech Pretraining for Expressive Text-to-Speech Synthesis. ACM Multimedia 2024: 2099-2107
- [c5] Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie:
UniStyle: Unified Style Modeling for Speaking Style Captioning and Stylistic Speech Synthesis. ACM Multimedia 2024: 7513-7522
- [i15] Linhan Ma, Xinfa Zhu, Yuanjun Lv, Zhichao Wang, Ziqian Wang, Wendi He, Hongbin Zhou, Lei Xie:
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy. CoRR abs/2406.09844 (2024)
- [i14] Dake Guo, Jixun Yao, Xinfa Zhu, Kangxiang Xia, Zhao Guo, Ziyu Zhang, Yao Wang, Jie Liu, Lei Xie:
The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge. CoRR abs/2410.23815 (2024)
- [i13] Yuke Li, Xinfa Zhu, Hanzhao Li, Jixun Yao, Wenjie Tian, XiPeng Yang, YunLin Chen, Zhifei Li, Lei Xie:
CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion. CoRR abs/2411.18918 (2024)
- [i12] Zihao Chen, Haomin Zhang, Xinhan Di, Haoyu Wang, Sizhe Shan, Junjie Zheng, Yunming Liang, Yihan Fan, Xinfa Zhu, Wenjie Tian, Yihua Wang, Chaofan Ding, Lei Xie:
YingSound: Video-Guided Sound Effects Generation with Multi-modal Chain-of-Thought Controls. CoRR abs/2412.09168 (2024)
- [i11] Xinfa Zhu, Wenjie Tian, Lei Xie:
Autoregressive Speech Synthesis with Next-Distribution Prediction. CoRR abs/2412.16846 (2024)
- 2023
- [j2] Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Lei Xie:
DiCLET-TTS: Diffusion Model Based Cross-Lingual Emotion Transfer for Text-to-Speech - A Study Between English and Mandarin. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3418-3430 (2023)
- [c4] Dake Guo, Xinfa Zhu, Liumeng Xue, Tao Li, Yuanjun Lv, Yuepeng Jiang, Lei Xie:
HIGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS. ASRU 2023: 1-7
- [c3] Yuke Li, Xinfa Zhu, Yi Lei, Hai Li, Junhui Liu, Danming Xie, Lei Xie:
Zero-Shot Emotion Transfer for Cross-Lingual Speech Synthesis. ASRU 2023: 1-8
- [c2] Xinfa Zhu, Yi Lei, Kun Song, Yongmao Zhang, Tao Li, Lei Xie:
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling. ICASSP 2023: 1-5
- [i10] Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Lei Xie:
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech - A Study between English and Mandarin. CoRR abs/2309.00883 (2023)
- [i9] Dake Guo, Xinfa Zhu, Liumeng Xue, Tao Li, Yuanjun Lv, Yuepeng Jiang, Lei Xie:
HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-form TTS. CoRR abs/2309.13907 (2023)
- [i8] Yuke Li, Xinfa Zhu, Yi Lei, Hai Li, Junhui Liu, Danming Xie, Lei Xie:
Zero-Shot Emotion Transfer For Cross-Lingual Speech Synthesis. CoRR abs/2310.03963 (2023)
- [i7] Tao Li, Zhichao Wang, Xinfa Zhu, Jian Cong, Qiao Tian, Yuping Wang, Lei Xie:
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning. CoRR abs/2310.04004 (2023)
- [i6] Xinfa Zhu, Yuanjun Lv, Yi Lei, Tao Li, Wendi He, Hongbin Zhou, Heng Lu, Lei Xie:
Vec-Tok Speech: speech vectorization and tokenization for neural speech generation. CoRR abs/2310.07246 (2023)
- [i5] Xinfa Zhu, Yuke Li, Yi Lei, Ning Jiang, Guoqing Zhao, Lei Xie:
Multi-Speaker Expressive Speech Synthesis via Semi-supervised Contrastive Learning. CoRR abs/2310.17101 (2023)
- [i4] Hanzhao Li, Xinfa Zhu, Liumeng Xue, Yang Song, Yunlin Chen, Lei Xie:
SponTTS: modeling and transferring spontaneous style for TTS. CoRR abs/2311.07179 (2023)
- [i3] Linhan Ma, Yongmao Zhang, Xinfa Zhu, Yi Lei, Ziqian Ning, Pengcheng Zhu, Lei Xie:
Accent-VITS: accent transfer for end-to-end TTS. CoRR abs/2312.16850 (2023)
- 2022
- [j1] Yi Lei, Shan Yang, Xinfa Zhu, Lei Xie, Dan Su:
Cross-Speaker Emotion Transfer Through Information Perturbation in Emotional Speech Synthesis. IEEE Signal Process. Lett. 29: 1948-1952 (2022)
- [c1] Yuhao Liang, Peikun Chen, Fan Yu, Xinfa Zhu, Tianyi Xu, Yingying Gao, Lei Xie:
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR Challenge. ISCSLP 2022: 532-536
- [i2] Yuhao Liang, Peikun Chen, Fan Yu, Xinfa Zhu, Tianyi Xu, Lei Xie:
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR Challenge. CoRR abs/2210.14448 (2022)
- [i1] Xinfa Zhu, Yi Lei, Kun Song, Yongmao Zhang, Tao Li, Lei Xie:
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling. CoRR abs/2211.10568 (2022)
last updated on 2025-03-04 02:00 CET by the dblp team
all metadata released as open data under CC0 1.0 license