Shixiong Zhang 0001
Person information
- affiliation: Capital One, USA
- affiliation: Tencent AI Lab, Bellevue, WA, USA
- affiliation: Microsoft Corporation, Redmond, WA, USA
- affiliation (PhD 2018): Cambridge University, Engineering Department, Machine Intelligence Laboratory, Cambridge UK
Other persons with the same name
- Shixiong Zhang (aka: Shi-Xiong Zhang) — disambiguation page
- Shixiong Zhang 0002 — Xidian University, School of Computer Science and Technology, Xi'an, China (and 1 more)
2020 – today
- 2024
- [c49]Yaoxun Xu, Hangting Chen, Jianwei Yu, Qiaochu Huang, Zhiyong Wu, Shi-Xiong Zhang, Guangzhi Li, Yi Luo, Rongzhi Gu:
SECap: Speech Emotion Captioning with Large Language Model. AAAI 2024: 19323-19331
- [c48]Zili Huang, Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
UniX-Encoder: A Universal X-Channel Speech Encoder for AD-HOC Microphone Array Speech Processing. ICASSP 2024: 11991-11995
- [i44]Zheshu Song, Jianheng Zhuo, Yifan Yang, Ziyang Ma, Shixiong Zhang, Xie Chen:
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR. CoRR abs/2406.06619 (2024)
- [i43]Mohan Shi, Zengrui Jin, Yaoxun Xu, Yong Xu, Shi-Xiong Zhang, Kun Wei, Yiwen Shao, Chunlei Zhang, Dong Yu:
Advancing Multi-talker ASR Performance with Large Language Models. CoRR abs/2408.17431 (2024)
- [i42]Yaoxun Xu, Shi-Xiong Zhang, Jianwei Yu, Zhiyong Wu, Dong Yu:
Comparing Discrete and Continuous Space LLMs for Speech Recognition. CoRR abs/2409.00800 (2024)
- [i41]Zengrui Jin, Yifan Yang, Mohan Shi, Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Lingwei Meng, Long Lin, Yong Xu, Shi-Xiong Zhang, Daniel Povey:
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization. CoRR abs/2409.00819 (2024)
- [i40]Genta Indra Winata, Hanyang Zhao, Anirban Das, Wenpin Tang, David D. Yao, Shi-Xiong Zhang, Sambit Sahu:
Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey. CoRR abs/2409.11564 (2024)
- [i39]Akshaj Kumar Veldanda, Shi-Xiong Zhang, Anirban Das, Supriyo Chakraborty, Stephen Rawls, Sambit Sahu, Milind R. Naphade:
LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models. CoRR abs/2409.13054 (2024)
- [i38]Hanyang Zhao, Genta Indra Winata, Anirban Das, Shi-Xiong Zhang, David D. Yao, Wenpin Tang, Sambit Sahu:
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization. CoRR abs/2410.04203 (2024)
- 2023
- [j12]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 849-862 (2023)
- [c47]Meng Yu, Yong Xu, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu:
Neuralecho: Hybrid of Full-Band and Sub-Band Recurrent Neural Network For Acoustic Echo Cancellation and Speech Enhancement. ASRU 2023: 1-8
- [c46]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Deep Neural Mel-Subband Beamformer for in-Car Speech Separation. ICASSP 2023: 1-5
- [c45]Ruize Xu, Ruoxuan Feng, Shi-Xiong Zhang, Di Hu:
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning. ICASSP 2023: 1-5
- [c44]Yong Xu, Vinay Kothapally, Meng Yu, Shixiong Zhang, Dong Yu:
Zoneformer: On-device Neural Beamformer For In-car Multi-zone Speech Separation, Enhancement and Echo Cancellation. INTERSPEECH 2023: 5117-5121
- [i37]Rongzhi Gu, Shi-Xiong Zhang, Dong Yu:
3D Neural Beamforming for Multi-channel Speech Separation Against Location Uncertainty. CoRR abs/2302.13462 (2023)
- [i36]Ruize Xu, Ruoxuan Feng, Shi-Xiong Zhang, Di Hu:
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning. CoRR abs/2303.05338 (2023)
- [i35]Anton Ratnarajah, Shi-Xiong Zhang, Dong Yu:
M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec. CoRR abs/2309.07416 (2023)
- [i34]Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR. CoRR abs/2311.00146 (2023)
- [i33]Yaoxun Xu, Hangting Chen, Jianwei Yu, Qiaochu Huang, Zhiyong Wu, Shi-Xiong Zhang, Guangzhi Li, Yi Luo, Rongzhi Gu:
SECap: Speech Emotion Captioning with Large Language Model. CoRR abs/2312.10381 (2023)
- 2022
- [c43]Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu:
Fast-Rir: Fast Neural Diffuse Room Impulse Response Generator. ICASSP 2022: 571-575
- [c42]Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature. ICASSP 2022: 6067-6071
- [c41]Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. ICASSP 2022: 6412-6416
- [c40]Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou:
Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI. ICASSP 2022: 7782-7786
- [c39]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Joint Neural AEC and Beamforming with Double-Talk Detection. INTERSPEECH 2022: 2528-2532
- [c38]Soumi Maiti, Yushi Ueda, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu:
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers. SLT 2022: 480-487
- [i32]Yushi Ueda, Soumi Maiti, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu:
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers. CoRR abs/2203.17068 (2022)
- [i31]Meng Yu, Yong Xu, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu:
NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement. CoRR abs/2205.10401 (2022)
- [i30]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Deep Neural Mel-Subband Beamformer for In-car Speech Separation. CoRR abs/2211.12590 (2022)
- [i29]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation. CoRR abs/2212.08348 (2022)
- 2021
- [j11]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Complex Neural Spatial Filter: Enhancing Multi-Channel Target Speech Separation in Complex Domain. IEEE Signal Process. Lett. 28: 1370-1374 (2021)
- [j10]Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen:
An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1368-1396 (2021)
- [j9]Jianwei Yu, Shi-Xiong Zhang, Bo Wu, Shansong Liu, Shoukang Hu, Mengzhe Geng, Xunying Liu, Helen Meng, Dong Yu:
Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2067-2082 (2021)
- [j8]Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Donald S. Williamson, Dong Yu:
Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3526-3540 (2021)
- [c37]Rongzhi Gu, Shi-Xiong Zhang, Meng Yu, Dong Yu:
3D Spatial Features for Multi-Channel Target Speech Separation. ASRU 2021: 996-1002
- [c36]Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu:
ADL-MVDR: All Deep Learning MVDR Beamformer for Target Speech Separation. ICASSP 2021: 6089-6093
- [c35]Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu:
Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. ICASSP 2021: 8433-8437
- [c34]Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu:
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. Interspeech 2021: 1109-1113
- [c33]Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu:
MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation. Interspeech 2021: 1119-1123
- [c32]Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu:
MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment. Interspeech 2021: 2142-2146
- [c31]Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation. Interspeech 2021: 3076-3080
- [c30]Saurabh Kataria, Shi-Xiong Zhang, Dong Yu:
Multi-Channel Speaker Verification for Single and Multi-Talker Speech. Interspeech 2021: 4608-4612
- [c29]Jianming Liu, Meng Yu, Yong Xu, Chao Weng, Shi-Xiong Zhang, Lianwu Chen, Dong Yu:
Neural Mask based Multi-channel Convolutional Beamforming for Joint Dereverberation, Echo Cancellation and Denoising. SLT 2021: 766-770
- [c28]Zhaoheng Ni, Yong Xu, Meng Yu, Bo Wu, Shi-Xiong Zhang, Dong Yu, Michael I. Mandel:
WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation. SLT 2021: 817-824
- [i28]Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu:
Generalized RNN beamformer for target speech separation. CoRR abs/2101.01280 (2021)
- [i27]Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu:
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. CoRR abs/2103.16849 (2021)
- [i26]Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu:
MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment. CoRR abs/2104.01227 (2021)
- [i25]Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu:
MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation. CoRR abs/2104.08450 (2021)
- [i24]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Complex Neural Spatial Filter: Enhancing Multi-channel Target Speech Separation in Complex Domain. CoRR abs/2104.12359 (2021)
- [i23]Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu:
FAST-RIR: Fast neural diffuse room impulse response generator. CoRR abs/2110.04057 (2021)
- [i22]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Joint AEC AND Beamforming with Double-Talk Detection using RNN-Transformer. CoRR abs/2111.04904 (2021)
- [i21]Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature. CoRR abs/2111.11023 (2021)
- [i20]Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. CoRR abs/2111.15016 (2021)
- [i19]Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou:
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI. CoRR abs/2112.02498 (2021)
- 2020
- [j7]Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu:
Multi-Modal Multi-Channel Target Speech Separation. IEEE J. Sel. Top. Signal Process. 14(3): 530-541 (2020)
- [j6]Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu:
Audio-Visual Speech Separation and Dereverberation With a Two-Stage Multimodal Network. IEEE J. Sel. Top. Signal Process. 14(3): 542-553 (2020)
- [c27]Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang:
Self-Supervised Learning for Audio-Visual Speaker Diarization. ICASSP 2020: 4367-4371
- [c26]Jianwei Yu, Shi-Xiong Zhang, Jian Wu, Shahram Ghorbani, Bo Wu, Shiyin Kang, Shansong Liu, Xunying Liu, Helen Meng, Dong Yu:
Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset. ICASSP 2020: 6984-6988
- [c25]Aswin Shanmugam Subramanian, Chao Weng, Meng Yu, Shi-Xiong Zhang, Yong Xu, Shinji Watanabe, Dong Yu:
Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives. ICASSP 2020: 7299-7303
- [c24]Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning. ICASSP 2020: 7319-7323
- [c23]Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu:
Neural Spatio-Temporal Beamformer for Target Speech Separation. INTERSPEECH 2020: 56-60
- [c22]Shansong Liu, Xurong Xie, Jianwei Yu, Shoukang Hu, Mengzhe Geng, Rongfeng Su, Shi-Xiong Zhang, Xunying Liu, Helen Meng:
Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition. INTERSPEECH 2020: 711-715
- [c21]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-Visual Multi-Channel Recognition of Overlapped Speech. INTERSPEECH 2020: 3496-3500
- [i18]Jianwei Yu, Shi-Xiong Zhang, Jian Wu, Shahram Ghorbani, Bo Wu, Shiyin Kang, Shansong Liu, Xunying Liu, Helen Meng, Dong Yu:
Audio-visual Recognition of Overlapped speech for the LRS2 dataset. CoRR abs/2001.01656 (2020)
- [i17]Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang:
Self-supervised learning for audio-visual speaker diarization. CoRR abs/2002.05314 (2020)
- [i16]Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning. CoRR abs/2003.03927 (2020)
- [i15]Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu:
Multi-modal Multi-channel Target Speech Separation. CoRR abs/2003.07032 (2020)
- [i14]Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu:
Neural Spatio-Temporal Beamformer for Target Speech Separation. CoRR abs/2005.03889 (2020)
- [i13]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-visual Multi-channel Recognition of Overlapped Speech. CoRR abs/2005.08571 (2020)
- [i12]Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen:
An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation. CoRR abs/2008.09586 (2020)
- [i11]Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu:
Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. CoRR abs/2011.00091 (2020)
- [i10]Zhaoheng Ni, Yong Xu, Meng Yu, Bo Wu, Shi-Xiong Zhang, Dong Yu, Michael I. Mandel:
WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation. CoRR abs/2011.09162 (2020)
- [i9]Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Donald S. Williamson, Dong Yu:
Multi-channel Multi-frame ADL-MVDR for Target Speech Separation. CoRR abs/2012.13442 (2020)
2010 – 2019
- 2019
- [c20]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Time Domain Audio Visual Speech Separation. ASRU 2019: 667-673
- [c19]Shi-Xiong Zhang, Yifan Gong, Dong Yu:
Encrypted Speech Recognition Using Deep Polynomial Networks. ICASSP 2019: 5691-5695
- [c18]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Improved Speaker-Dependent Separation for CHiME-5 Challenge. INTERSPEECH 2019: 466-470
- [c17]Rongzhi Gu, Lianwu Chen, Shi-Xiong Zhang, Jimeng Zheng, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information. INTERSPEECH 2019: 4290-4294
- [c16]Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu:
A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation. INTERSPEECH 2019: 4574-4578
- [i8]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Time Domain Audio Visual Speech Separation. CoRR abs/1904.03760 (2019)
- [i7]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Improved Speaker-Dependent Separation for CHiME-5 Challenge. CoRR abs/1904.03792 (2019)
- [i6]Shi-Xiong Zhang, Yifan Gong, Dong Yu:
Encrypted Speech Recognition using Deep Polynomial Networks. CoRR abs/1905.05605 (2019)
- [i5]Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
End-to-End Multi-Channel Speech Separation. CoRR abs/1905.06286 (2019)
- [i4]Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu:
A comprehensive study of speech separation: spectrogram vs waveform separation. CoRR abs/1905.07497 (2019)
- [i3]Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu:
Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network. CoRR abs/1909.07352 (2019)
- [i2]Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu:
A Unified Framework for Speech Separation. CoRR abs/1912.07814 (2019)
- 2018
- [c15]Liping Chen, Yong Zhao, Shi-Xiong Zhang, Jie Li, Guoli Ye, Frank K. Soong:
Exploring Sequential Characteristics in Speaker Bottleneck Feature for Text-Dependent Speaker Verification. ICASSP 2018: 5364-5368
- [c14]Yong Zhao, Jinyu Li, Shi-Xiong Zhang, Liping Chen, Yifan Gong:
Domain and Speaker Adaptation for Cortana Speech Recognition. ICASSP 2018: 5984-5988
- 2017
- [p1]Yifan Gong, Yan Huang, Kshitiz Kumar, Jinyu Li, Chaojun Liu, Guoli Ye, Shi-Xiong Zhang, Yong Zhao, Rui Zhao:
Challenges in and Solutions to Deep Learning Network Acoustic Modeling in Speech Recognition Products at Microsoft. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 401-417
- [i1]Shi-Xiong Zhang, Zhuo Chen, Yong Zhao, Jinyu Li, Yifan Gong:
End-to-End Attention based Text-Dependent Speaker Verification. CoRR abs/1701.00562 (2017)
- 2016
- [c13]Yajie Miao, Jinyu Li, Yongqiang Wang, Shi-Xiong Zhang, Yifan Gong:
Simplifying long short-term memory acoustic models for fast training and decoding. ICASSP 2016: 2284-2288
- [c12]Shi-Xiong Zhang, Rui Zhao, Chaojun Liu, Jinyu Li, Yifan Gong:
Recurrent support vector machines for speech recognition. ICASSP 2016: 5885-5889
- [c11]Shi-Xiong Zhang, Zhuo Chen, Yong Zhao, Jinyu Li, Yifan Gong:
End-to-End attention based text-dependent speaker verification. SLT 2016: 171-178
- 2015
- [c10]Shi-Xiong Zhang, Chaojun Liu, Kaisheng Yao, Yifan Gong:
Deep neural support vector machines for speech recognition. ICASSP 2015: 4275-4279
- 2014
- [c9]Jingzhou Yang, Rogier C. van Dalen, Shi-Xiong Zhang, Mark J. F. Gales:
Infinite structured support vector machines for speech recognition. ICASSP 2014: 3320-3324
- 2013
- [j5]Shi-Xiong Zhang, Mark J. F. Gales:
Structured SVMs for Automatic Speech Recognition. IEEE Trans. Speech Audio Process. 21(3): 544-555 (2013)
- [c8]Kate M. Knill, Mark J. F. Gales, Shakti P. Rath, Philip C. Woodland, Chao Zhang, Shi-Xiong Zhang:
Investigation of multilingual deep neural networks for spoken term detection. ASRU 2013: 138-143
- [c7]Shi-Xiong Zhang, Mark J. F. Gales:
Kernelized log linear models for continuous speech recognition. ICASSP 2013: 6950-6954
- 2011
- [j4]Shi-Xiong Zhang, Man-Wai Mak:
Optimized Discriminative Kernel for SVM Scoring and Its Application to Speaker Verification. IEEE Trans. Neural Networks 22(2): 173-185 (2011)
- [c6]Shi-Xiong Zhang, Mark J. F. Gales:
Extending noise robust structured support vector machines to larger vocabulary tasks. ASRU 2011: 18-23
- [c5]Shi-Xiong Zhang, Mark J. F. Gales:
Structured Support Vector Machines for Noise Robust Continuous Speech Recognition. INTERSPEECH 2011: 989-990
- 2010
- [j3]Shi-Xiong Zhang, Anton Ragni, Mark J. F. Gales:
Structured Log Linear Models for Noise Robust Speech Recognition. IEEE Signal Process. Lett. 17(11): 945-948 (2010)
2000 – 2009
- 2009
- [j2]Shi-Xiong Zhang, Man-Wai Mak:
A new adaptation approach to high-level speaker-model creation in speaker verification. Speech Commun. 51(6): 534-550 (2009)
- [c4]Shi-Xiong Zhang, Man-Wai Mak:
Optimization of discriminative kernels in SVM speaker verification. INTERSPEECH 2009: 1275-1278
- 2008
- [c3]Shi-Xiong Zhang, Man-Wai Mak:
High-level speaker verification via articulatory-feature based sequence kernels and SVM. INTERSPEECH 2008: 1393-1396
- 2007
- [j1]Shi-Xiong Zhang, Man-Wai Mak, Helen M. Meng:
Speaker Verification via High-Level Feature Based Phonetic-Class Pronunciation Modeling. IEEE Trans. Computers 56(9): 1189-1198 (2007)
- [c2]Shi-Xiong Zhang, Man-Wai Mak, Helen M. Meng:
High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling. INTERSPEECH 2007: 762-765
- [c1]Shi-Xiong Zhang, Man-Wai Mak:
A New Adaptation Method for Speaker-Model Creation in High-Level Speaker Verification. PCM 2007: 325-335
last updated on 2024-11-27 20:26 CET by the dblp team
all metadata released as open data under CC0 1.0 license