13th ISCSLP 2022: Singapore
- Kong Aik Lee, Hung-yi Lee, Yanfeng Lu, Minghui Dong:
13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022, Singapore, December 11-14, 2022. IEEE 2022, ISBN 979-8-3503-9796-3
- Chao-Han Huck Yang, Jun Qi, Sabato Marco Siniscalchi, Chin-Hui Lee:
An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition. 1-5
- Yanbing Yang, Hao Shi, Yuqin Lin, Meng Ge, Longbiao Wang, Qingzhi Hou, Jianwu Dang:
Adaptive Attention Network with Domain Adversarial Training for Multi-Accent Speech Recognition. 6-10
- Haoyu Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan:
Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models. 11-15
- Song Li, Haoneng Luo, Wenxuan Hu, Yuan Liu, Shiliang Zhang, Lin Li, Qingyang Hong:
Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation. 16-20
- Qingxuan Li, Han Zhu, Liuping Luo, Gaofeng Cheng, Pengyuan Zhang, Jiasong Sun, Yonghong Yan:
Sequence Distribution Matching for Unsupervised Domain Adaptation in ASR. 21-25
- HoLam Chung, Junan Li, Pengfei Liu, Wai-Kim Leung, Xixin Wu, Helen Meng:
Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition. 26-30
- Moyu Chen, Jing Qi, Xiyu Wu:
Perception and production of Mandarin vowels by teenagers-blind and sighted. 31-35
- Jing Lu, Ping Tang:
The Production of Contrastive Focus by Children Learning Mandarin Chinese. 36-40
- Linjiao Pan, Yuan Jia:
Production Characteristics of Vowels in Standard Chinese by Preschool Bilingual Teachers. 41-45
- Chong Cao, Aijun Li:
Effects of Aspiration on Tone Production and Perception in Standard Chinese. 46-50
- Jingwen Cheng, Yingming Gao, Yuchen Yan, Xiaoli Feng, Binghuai Lin, Jinsong Zhang:
The Disyllabic Tone Production and Tone Context Effect in Mandarin-speaking Children with Cochlear Implants. 51-55
- Shuwen Chen:
A preliminary ultrasonic investigation of tenseness in Northern Yi. 56-60
- Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang, Zhongyuan Wang:
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis. 61-65
- Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie, Guoqiao Yu, Guanglu Wan:
Multi-speaker Multi-style Text-to-speech Synthesis with Single-speaker Single-style Training Data Scenarios. 66-70
- Kun Song, Jian Cong, Xinsheng Wang, Yongmao Zhang, Lei Xie, Ning Jiang, Haiying Wu:
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS. 71-75
- Yongmao Zhang, Zhichao Wang, Peiji Yang, Hongshen Sun, Zhisheng Wang, Lei Xie:
AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents. 76-80
- Daxin Tan, Liqun Deng, Nianzu Zheng, Yu Ting Yeung, Xin Jiang, Xiao Chen, Tan Lee:
CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction. 81-85
- Xueyuan Chen, Qiaochu Huang, Xixin Wu, Zhiyong Wu, Helen Meng:
HILvoice:Human-in-the-Loop Style Selection for Elder-Facing Speech Synthesis. 86-90
- Qicong Xie, Shan Yang, Yi Lei, Lei Xie, Dan Su:
End-to-End Voice Conversion with Information Perturbation. 91-95
- Zeqing Zhao, Sifan Ma, Yan Jia, Jingyu Hou, Lin Yang, Junjie Wang:
Mix-Guided VC: Any-to-many Voice Conversion by Combining ASR and TTS Bottleneck Features. 96-100
- Dengfeng Ke, Wenhan Yao, Ruixin Hu, Liangjie Huang, Qi Luo, Wentao Shu:
A New Spoken Language Teaching Tech: Combining Multi-attention and AdaIN for One-shot Cross Language Voice Conversion. 101-104
- Madhu R. Kamble, Hemant A. Patil:
The Impact of Room Acoustics on Replay Speech Signal. 105-109
- Priyanka Gupta, Hemant A. Patil:
Effect of Speaker-Microphone Proximity on Pop Noise: Continuous Wavelet Transform-Based Approach. 110-114
- Lei Wang, Benedict Yeoh, Jun Wah Ng:
Synthetic Voice Detection and Audio Splicing Detection using SE-Res2Net-Conformer Architecture. 115-119
- Zhiping Zeng, Zhizheng Wu:
Audio Splicing Localization: Can We Accurately Locate the Splicing Tampering? 120-124
- Shuai Nie, Shan Liang, Zhanlei Yang, Longshuai Xiao, Wenju Liu, Jianhua Tao:
Masking-based Neural Beamformer for Multichannel Speech Enhancement. 125-129
- Junjie Li, Meng Ge, Longbiao Wang, Jianwu Dang:
Deep Multi-task Cascaded Acoustic Echo Cancellation and Noise Suppression. 130-134
- Chenyi Li, Zhiyong Wu, Wei Rao, Yannan Wang, Helen Meng:
Boosting the Performance of SpEx+ by Attention and Contextual Mechanism. 135-139
- Shangdi Liao, Fei Chen:
Assessing the Effect of Temporal Misalignment between the Probe and Processed Speech Signals on Objective Speech Quality Evaluation. 140-144
- Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao, Hsin-Min Wang:
Speech-enhanced and Noise-aware Networks for Robust Speech Recognition. 145-149
- Yuxiao Lin, Zhihao Du, Shiliang Zhang, Fan Yu, Zhou Zhao, Fei Wu:
Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR. 150-154
- Wen-Yuan Ting, Syu-Siang Wang, Hsin-Li Chang, Borching Su, Yu Tsao:
Speech Enhancement Based on CycleGAN with Noise-informed Training. 155-159
- Meng Li, Yan Xia, Feng Lin:
Incorporating VAD into ASR System by Multi-task Learning. 160-164
- Yen-Lun Liao, Chi-Han Lin, Ren-Yuan Lyu, Jyh-Shing Roger Jang:
Improving ASR in Reverberant Environments. 165-169
- Zhao You, Shulin Feng, Dan Su, Dong Yu:
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition. 170-174
- Yuting Yang, Binbin Du, Yuke Li:
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition. 175-179
- Keyu An, Ji Xiao, Zhijian Ou:
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study. 180-184
- Shu-Fen Tsai, Shih-Chan Kuo, Ren-Yuan Lyu, Jyh-Shing Roger Jang:
Ensemble And Re-Ranking Based On Language Models To Improve ASR. 185-189
- Yue Wang, Wen Liu:
Acoustic and Perceptual Study of Tones in Jin Chinese (Togtoh variety). 190-194
- Min Xu, Jing Shao, Hongwei Ding, Lan Wang:
Acoustic-perceptual correlates of whispered Mandarin consonants. 195-199
- Kimiko Tsukada, Yurong Yurong, Badmaavanchin Munguntsetseg:
Bilingual Advantage? Perception of the Japanese Consonant Length Contrast by Monolingual vs Bilingual Speakers of Mongolian. 200-204
- Ruiqi Ge, Xiyu Wu:
Multichannel Emotional Perception in Chinese Female: Faces, Voices, and Bodies. 205-209
- Yanyang Chen, Xinya Zhang, Ying Chen, Jiazheng Wang:
Coda Nasal Perception in Wenzhou Wu and Rugao Mandarin by Native Speakers of Standard Mandarin. 210-214
- Li Liu, Gang Feng, Xiaoxi Ren, Xianping Ma:
Objective Hand Complexity Comparison between Two Mandarin Chinese Cued Speech Systems. 215-219
- Dengfeng Ke, Yayue Deng, Yukang Jia, Jinlong Xue, Qi Luo, Ya Li, Jianqing Sun, Jiaen Liang, Binghuai Lin:
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis. 220-224
- Dengfeng Ke, Ruixin Hu, Qi Luo, Liangjie Huang, Wenhan Yao, Wentao Shu, Jinsong Zhang, Yanlu Xie:
AdaptiveFormer: A Few-shot Speaker Adaptive Speech Synthesis Model based on FastSpeech2. 225-229
- Jinlong Xue, Yayue Deng, Yichen Han, Ya Li, Jianqing Sun, Jiaen Liang:
ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis. 230-234
- Li-Jen Yang, I-Ping Yeh, Jen-Tzung Chien:
Low-Resource Speech Synthesis with Speaker-Aware Embedding. 235-239
- Zhijunyi Yang, Mengjie Du, Rongfeng Su, Xiaokang Liu, Nan Yan, Lan Wang:
A Phone-Level Speaker Embedding Extraction Framework with Multi-Gate Mixture-of-Experts Based Multi-Task Learning. 240-244
- Wan Lin, Lantian Li, Dong Wang:
Shuffle is What You Need. 245-249
- Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du, Chin-Hui Lee:
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function. 250-254
- Chenxi Wang, Hang Chen, Jun Du, Baocai Yin, Jia Pan:
Multi-Task Joint Learning for Embedding Aware Audio-Visual Speech Enhancement. 255-259
- Jiajun Liu, Huazhen Meng, Yunfei Shen, Linna Zheng, Aishan Wumaier:
Multimodal automatic speech fluency evaluation method for Putonghua Proficiency Test propositional speaking section. 260-264
- Raymond Chung:
Cantonese neural speech synthesis from found newscasting video data and its speaker adaptation. 265-269
- Yuan-Fu Liao, Yu-Hsuan Huang, Matús Pleva, Daniel Hládek, Ming-Hsiang Su:
A Preliminary Study on Taiwanese OCR for Assisting Textual Database Construction from Historical Documents. 270-274
- Di Zhou, Masashi Unoki, Gaoyan Zhang, Jianwu Dang:
Reconstruction of speech spectrogram based on non-invasive EEG signal. 275-279
- Binbin Shen, Jian Luan, Shengyan Zhang, Quanbo Shen, Yujun Wang:
J-TranPSP: A Joint Transition-based Model for Prosodic Structure Prediction, Word Segmentation and PoS Tagging. 280-284
- Peiyang Shi, Zengqiang Shang, Pengyuan Zhang:
A Mandarin Prosodic Boundary Prediction Model Based on Multi-Source Semi-Supervision. 285-289
- Mosi He, Ting Zhang, Bin Li, Kin Cheung:
English lexical stresses in non-native speech under adverse conditions. 290-294
- Jingwen Huang, Aijun Li:
Stress Gravity of Neutral Tone Words in Different Information Structures. 295-299
- Tong Li, Hui Feng, Yuan Jia:
Prosodic Encoding of Mandarin Chinese Intonation by Uygur Speakers in Declarative and Interrogative Sentences. 300-304
- Yuhan Yan, Shanpeng Li, Ying Chen:
In-group Advantage for Chinese and English Emotional Prosody in Quiet and Noise Conditions. 305-309
- Jian Tang, Shaofei Xue:
Multi-Resolution Stacked 1D-CNN for Small-Footprint keyword Spotting with Two-Stage Detection. 310-314
- Yao-Ting Wang, Yi-Xing Lin, Kai-Wen Liang, Tzu-Chiang Tai, Jia-Ching Wang:
Lightweight End-To-End Deep Learning Model For Music Source Separation. 315-318
- Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Lei Xie, Bing Yang, Xiong Zhang, Dan Su:
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation. 319-323
- Zhiyuan Peng, Xuanji He, Ke Ding, Tan Lee, Guanglu Wan:
Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition. 324-328
- Bowen Qu, Chenda Li, Jinfeng Bai, Yanmin Qian:
Improving Speech Separation with Knowledge Distilled from Self-supervised Pre-trained Models. 329-333
- Wei Wang, Wangyou Zhang, Shaoxiong Lin, Yanmin Qian:
Text-Informed Knowledge Distillation for Robust Speech Enhancement and Recognition. 334-338
- Jiahao Lu, Bin Liu, Zheng Lian, Cong Cai, Jianhua Tao, Ziping Zhao:
Prediction of Depression Severity Based on Transformer Encoder and CNN Model. 339-343
- Yimin He, Xiaoyong Lu, Jingyi Yuan, Tao Pan, Yafan Wang:
Depressive Tendency Recognition by Fusing Speech and Text Features: A Comparative Analysis. 344-348
- Zhikai Zhou, Shuang Cao, Zhengyang Chen, Bei Liu, Ming Xia, Hong Jiang, Yanmin Qian:
Medical Difficult Airway Detection using Speech Technology. 349-353
- Dehua Tao, Harold Chui, Sarah Luk, Tan Lee:
CUEMPATHY: A Counseling Speech Dataset for Psychotherapy Research. 354-358
- Ying Qin, Tan Lee, Anthony Pak-Hin Kong, Feng Lin:
Aphasia Detection for Cantonese-Speaking and Mandarin-Speaking Patients Using Pre-Trained Language Models. 359-363
- Tinghao Zhao, Xiaoxia Du, Juan Liu, Rongfeng Su, Nan Yan, Lan Wang:
Respiratory and laryngeal influences on voice in post-stroke dysarthria: a pilot study. 364-368
- Tengfei Cao, Liang He, Fangjing Niu:
End-to-end speech topic classification based on pre-trained model Wavlm. 369-373
- Tsung-Hsien Yang, Matús Pleva, Daniel Hládek, Ming-Hsiang Su:
BERT-based Chinese Medicine Named Entity Recognition Model Applied to Medication Reminder Dialogue System. 374-378
- Yuning Liu, Di Zhou, Masashi Unoki, Jianwu Dang, Aijun Li:
Dialogue scenario classification based on social factors. 379-383
- Yuting Nie, Junhong Zhao, Wei-Qiang Zhang, Jinfeng Bai:
BERT-LID: Leveraging BERT to Improve Spoken Language Identification. 384-388
- Rian Bao, Linkai Peng, Yuchen Yan, Jinsong Zhang:
An Exploratory Study for Quantifying the Contextual Information for Successful Chinese L2 Speech Comprehension. 389-393
- Rian Bao, Linkai Peng, Yingming Gao, Jinsong Zhang:
The Contribution of Phonological and Fluency Factors to Chinese L2 Comprehensibility Ratings: A Case Study of Urdu-speaking Learners. 394-398
- Xinyi Zhang, Wen Liu:
An Acoustic Study on Fricative Vowel [iʑ] in Zhongwei Chinese. 399-403
- Yuan Jia, Xintong Zuo:
Acoustic Features of Consonants of Standard Chinese and English by Uyghur Native Speakers. 404-408
- Kaige Gao, Xiyu Wu:
A Study on Mandarin Chinese "Bu" Tone Sandhi Followed by English Words. 409-413
- Xiaoli Feng, Yingming Gao, Jinsong Zhang, Yanchun Cao:
An Entropy-based Study on the Acquisition of Mandarin Initial Consonants by Korean Learners. 414-418
- Yujie Ji, Qiqi Sun, Zhikang Peng, Xiaoming Jiang:
Impacts of aging on suprasegmental and segmental encoding of vocally-expressed confidence in Wuxi dialect. 419-423
- Julie Siying Chen, Stephen Politzer-Ahles:
Acceptance of tonal and segmental variability correlates to inventory size in Mandarin Chinese. 424-427
- Tanmay Khandelwal, Rohan Kumar Das:
Dynamic Thresholding on FixMatch with Weak and Strong Data Augmentations for Sound Event Detection. 428-432
- Aastha Kachhi, Shreya S. Chaturvedi, Hemant A. Patil, Dipesh K. Singh:
Data Augmentation for Infant Cry Classification. 433-437
- Yikang Wang, Xingming Wang, Hiromitsu Nishizaki, Ming Li:
Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities. 438-442
- Shaofei Xue, Jian Tang, Yazhu Liu:
Improving Speech Recognition with Augmented Synthesized Data and Conditional Model Training. 443-447
- Houjun Huang, Yanmin Qian:
Speaking style compensation on synthetic audio for robust keyword spotting. 448-452
- Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee:
A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification. 453-457
- Rohith Mars, Rohan Kumar Das:
On the Use of Absolute Threshold of Hearing-based Loss for Full-band Speech Enhancement. 458-462
- Tailong Zhang, Shulin He, Hao Li, Xueliang Zhang:
RAT: RNN-Attention Transformer for Speech Enhancement. 463-467
- Weitong Zhao, Fushi Xie, Kang Ouyang, Nengheng Zheng:
A Speech-Noise-Equilibrium Loss Function for Deep Learning-Based Speech Enhancement. 468-472
- Shulin He, Hao Li, Xueliang Zhang:
Speakerfilter-Pro: an improved target speaker extractor combines the time domain and frequency domain. 473-477
- Hui Li, Zhihua Huang, Chuangjian Guo:
Two-Branch Network with Selective Kernel Convolution for Time-Domain Speech Enhancement. 478-482
- Guochen Yu, Andong Li, Wenzhe Liu, Chengshi Zheng, Yutian Wang, Hui Wang:
Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Full-Band Speech Enhancement. 483-487
- Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan:
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines. 488-492
- Kai Li:
Spectral Clustering Based EEND-vector Clustering: A Robust System Fine-tuned on Simulated Conversations. 493-497
- Tao Liu, Xu Xiang, Zhengyang Chen, Bing Han, Kai Yu, Yanmin Qian:
The X-Lance Speaker Diarization System for the Conversational Short-phrase Speaker Diarization Challenge 2022. 498-501
- Bowen Pang, Huan Zhao, Gaosheng Zhang, Xiaoyue Yang, Yang Sun, Li Zhang, Qing Wang, Lei Xie:
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge. 502-506
- Ao Zhang, Fan Yu, Kaixun Huang, Lei Xie, Longbiao Wang, Eng Siong Chng, Hui Bu, Binbin Zhang, Wei Chen, Xin Xu:
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results. 507-511
- Yujia Sun, Bing Ge, Bo Chen, Zhen Fu, Jinxin He, Hongwei Gao, Xue Wang:
The FawAI ASR System for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge. 512-516
- Yan Jia, Mi Hong, Jingyu Hou, Kailong Ren, Sifan Ma, Jin Wang, Yinglin Ji, Fangzhen Peng, Lin Yang, Junjie Wang:
LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge. 517-521
- Hanzhi Guo, Yunshu Chen, Xukang Xie, Gaopeng Xu, Wei Guo:
Efficient Conformer-Based CTC Model for Intelligent Cockpit Speech Recognition. 522-526
- Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Summary On The ISCSLP 2022 Chinese-English Code-Switching ASR Challenge. 527-531
- Yuhao Liang, Peikun Chen, Fan Yu, Xinfa Zhu, Tianyi Xu, Yingying Gao, Lei Xie:
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR Challenge. 532-536
- Hengxin Yin, Guangyu Hu, Fei Wang, Pengfei Ren:
Hybrid CTC Language Identification Structure for Mandarin-English Code-Switching ASR. 537-541