ASRU 2013: Olomouc, Czech Republic
- 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, Olomouc, Czech Republic, December 8-12, 2013. IEEE 2013, ISBN 978-1-4799-2756-2
LM: Language Modeling
- Yangyang Shi, Martha A. Larson, Catholijn M. Jonker:
K-component recurrent neural network language models using curriculum learning. 1-6
- Matti Varjokallio, Mikko Kurimo, Sami Virpioja:
Learning a subword vocabulary based on unigram likelihood. 7-12
- Berlin Chen, Yi-Wen Chen, Kuan-Yu Chen, Ea-Ee Jan:
Effective pseudo-relevance feedback for language modeling in speech recognition. 13-18
- Long Qin, Alexander I. Rudnicky:
Learning better lexical properties for recurrent OOV words. 19-24
- Abhinav Sethy, Stanley F. Chen, Ebru Arisoy, Bhuvana Ramabhadran, Kartik Audhkhasi, Shrikanth S. Narayanan, Paul Vozila:
Joint training of interpolated exponential n-gram models. 25-30
- Hasim Sak, Cyril Allauzen, Kaisuke Nakajima, Françoise Beaufays:
Mixture of mixture n-gram language models. 31-36
AM: Acoustic Modeling
- Wen-Lin Zhang, Bi-Cheng Li, Wei-Qiang Zhang:
Compact acoustic modeling based on acoustic manifold using a mixture of factor analyzers. 37-42
- Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A generalized discriminative training framework for system combination. 43-48
- Vimal Manohar, Srinivas C. Bhargav, Srinivasan Umesh:
Acoustic modeling using transform-based phone-cluster adaptive training. 49-54
- George Saon, Hagen Soltau, David Nahamoo, Michael Picheny:
Speaker adaptation of neural network acoustic models using i-vectors. 55-59
- Udhyakumar Nallasamy, Mark C. Fuhs, Monika Woszczyna, Florian Metze, Tanja Schultz:
Neighbour selection and adaptation for rapid speaker-dependent ASR. 60-65
Dec: Decoder Search
- David Nolden, Ralf Schlüter, Hermann Ney:
Efficient nearly error-less LVCSR decoding based on incremental forward and backward passes. 66-71
SLU: Spoken Language Understanding
- Jingjing Liu, Panupong Pasupat, Yining Wang, Scott Cyphers, James R. Glass:
Query understanding enhanced by hierarchical parsing structures. 72-77
- Puyang Xu, Ruhi Sarikaya:
Convolutional neural network based triangular CRF for joint intent detection and slot filling. 78-83
- Jan Svec, Pavel Ircing, Lubos Smídl:
Semantic entity detection from multiple ASR hypotheses within the WFST framework. 84-89
- Ali Orkan Bayer, Giuseppe Riccardi:
On-line adaptation of semantic models for spoken language understanding. 90-95
- Juraj Pálfy, Sakhia Darjaa, Jiri Pospichal:
Dysfluent speech detection by image forensics techniques. 96-101
- Heriberto Cuayáhuitl, Nina Dethlefs, Helen Wright Hastie, Oliver Lemon:
Barge-in effects in Bayesian dialogue act recognition and simulation. 102-107
Dial: Spoken Dialog Systems
- Emmanuel Ferreira, Fabrice Lefèvre:
Expert-based reward shaping and exploration scheme for boosting policy learning of dialogue management. 108-113
- Takuya Hiraoka, Yuki Yamauchi, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura:
Dialogue management for leading the conversation in persuasive dialogue systems. 114-119
- Yun-Nung Chen, William Yang Wang, Alexander I. Rudnicky:
Unsupervised induction and filling of semantic slots for spoken dialogue systems using frame-semantic parsing. 120-125
Multi: Multilingual Speech & Language Processing
- Aanchan Mohan, Richard C. Rose:
Cross-lingual context sharing and parameter-tying for multi-lingual speech recognition. 126-131
- João Miranda, João Paulo da Silva Neto, Alan W. Black:
Improved punctuation recovery through combination of multiple speech streams. 132-137
- Kate M. Knill, Mark J. F. Gales, Shakti P. Rath, Philip C. Woodland, Chao Zhang, Shi-Xiong Zhang:
Investigation of multilingual deep neural networks for spoken term detection. 138-143
- Evgeny A. Stepanov, Ilya Kashkarev, Ali Orkan Bayer, Giuseppe Riccardi, Arindam Ghosh:
Language style and domain adaptation for cross-language SLU porting. 144-149
Robust: Robustness in ASR
- Rongfeng Su, Xunying Liu, Lan Wang:
Automatic model complexity control for generalized variable parameter HMMs. 150-155
- N. Vishnu Prasad, Srinivasan Umesh:
Improved cepstral mean and variance normalization using Bayesian framework. 156-161
- Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, Marco Matassoni:
The second 'CHiME' speech separation and recognition challenge: An overview of challenge systems and outcomes. 162-167
- Antti Hurmalainen, Tuomas Virtanen:
Learning state labels for sparse classification of speech with matrix deconvolution. 168-173
- D. S. Pavan Kumar, N. Vishnu Prasad, Vikas Joshi, Srinivasan Umesh:
Modified splice and its extension to non-stereo data for noise robust speech recognition. 174-179
- Ramón Fernandez Astudillo:
A propagation approach to modelling the joint distributions of clean and corrupted speech in the Mel-Cepstral domain. 180-185
- Soonho Baek, Hong-Goo Kang:
Vector Taylor series based HMM adaptation for generalized cepstrum in noisy environment. 186-191
SDRKWS: Spoken Document Retrieval and Keyword Spotting
- Steven Wegmann, Arlo Faria, Adam Janin, Korbinian Riedhammer, Nelson Morgan:
The TAO of ATWV: Probing the mysteries of keyword search performance. 192-197
- Yun-Chiao Li, Hung-yi Lee, Cheng-Tao Chung, Chun-an Chan, Lin-Shan Lee:
Towards unsupervised semantic retrieval of spoken content with query expansion based on automatically discovered acoustic patterns. 198-203
- Lidia Mangu, Hagen Soltau, Hong-Kwang Kuo, George Saon:
The IBM keyword search system for the DARPA RATS program. 204-209
- Damianos G. Karakos, Richard M. Schwartz, Stavros Tsakalidis, Le Zhang, Shivesh Ranjan, Tim Ng, Roger Hsiao, Guruprasad Saikumar, Ivan Bulyko, Long Nguyen, John Makhoul, Frantisek Grézl, Mirko Hannemann, Martin Karafiát, Igor Szöke, Karel Veselý, Lori Lamel, Viet Bac Le:
Score normalization and system combination for improved keyword spotting. 210-215
NewApp: New Applications of ASR
- Duc Le, Emily Mower Provost:
Emotion recognition from spontaneous speech using Hidden Markov models with deep belief networks. 216-221
- Han-Ping Shen, Nobuaki Minematsu, Takehiko Makino, Steven H. Weinberger, Teeraphon Pongkittiphan, Chung-Hsien Wu:
Automatic pronunciation clustering using a World English archive and pronunciation structure analysis. 222-227
- Alexei V. Ivanov, Shahab Jalalvand, Roberto Gretter, Daniele Falavigna:
Phonetic and anthropometric conditioning of MSA-KST cognitive impairment characterization system. 228-233
- Anna Katharina Fuchs, Juan Andres Morales-Cordovilla, Martin Hagmüller:
ASR for electro-laryngeal speech. 234-238
- Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen:
Automatic sentiment extraction from YouTube videos. 239-244
SPFea: Speech Signal Processing and Feature Extraction
- Hideaki Konno, Hideo Kanemitsu, Nobuyuki Takahashi, Mineichi Kudo:
Acoustic characteristics related to the perceptual pitch in whispered vowels. 245-249
- Azzedine Touazi, Mohamed Debyeche:
An SVD-based scheme for MFCC compression in distributed speech recognition system. 250-255
- Reza Sahraeian, Dirk Van Compernolle:
A study of supervised intrinsic spectral analysis for TIMIT phone classification. 256-260
- Florian Metze, Zaid Sheikh, Alex Waibel, Jonas Gehring, Kevin Kilgour, Quoc Bao Nguyen, Van Huy Nguyen:
Models of tone for tonal and non-tonal languages. 261-266
NN: Neural Networks in ASR
- Karel Veselý, Mirko Hannemann, Lukás Burget:
Semi-supervised training of Deep Neural Networks. 267-272
- Alex Graves, Navdeep Jaitly, Abdel-rahman Mohamed:
Hybrid speech recognition with Deep Bidirectional LSTM. 273-278
- Bo Li, Khe Chai Sim:
Improving robustness of deep neural networks via spectral masking for automatic speech recognition. 279-284
- Pawel Swietojanski, Arnab Ghoshal, Steve Renals:
Hybrid acoustic models for distant and multichannel large vocabulary speech recognition. 285-290
- Meng Cai, Yongzhe Shi, Jia Liu:
Deep maxout neural networks for speech recognition. 291-296
- Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, Bhuvana Ramabhadran:
Learning filter banks within a deep neural network framework. 297-302
- Tara N. Sainath, Lior Horesh, Brian Kingsbury, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Accelerating Hessian-free optimization for Deep Neural Networks by implicit preconditioning and sampling. 303-308
- Naoyuki Kanda, Ryu Takeda, Yasunari Obuchi:
Elastic spectral distortion for low resource speech recognition with deep neural networks. 309-314
- Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George E. Dahl, George Saon, Hagen Soltau, Tomás Beran, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Improvements to Deep Convolutional Neural Networks for LVCSR. 315-320
- Pierre L. Dognin, Vaibhava Goel:
Combining stochastic average gradient and Hessian-free optimization for sequence training of deep neural networks. 321-325
- Zhiheng Huang, Geoffrey Zweig, Michael Levit, Benoît Dumoulin, Barlas Oguz, Shawn Chang:
Accelerating recurrent neural network training via two stage classes and parallelization. 326-331
- David Imseng, Petr Motlícek, Philip N. Garner, Hervé Bourlard:
Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition. 332-337
- Guangsen Wang, Khe Chai Sim:
Context-dependent modelling of deep neural network using logistic regression. 338-343
- Jonas Gehring, Quoc Bao Nguyen, Florian Metze, Alex Waibel:
DNN acoustic modeling with modular multi-lingual feature extraction networks. 344-349
- Yosuke Kashiwagi, Daisuke Saito, Nobuaki Minematsu, Keikichi Hirose:
Discriminative piecewise linear transformation based on deep learning for noise robust automatic speech recognition. 350-355
- Kris Demuynck, Fabian Triefenbach:
Porting concepts from DNNs back to GMMs. 356-361
- Raymond Brueckner, Björn W. Schuller:
Hierarchical neural networks and enhanced class posteriors for social signal classification. 362-367
- Hank Liao, Erik McDermott, Andrew W. Senior:
Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription. 368-373
LowZero: ASR/Speech Search with Low or Zero Resources
- Liang Lu, Arnab Ghoshal, Steve Renals:
Acoustic data-driven pronunciation lexicon for large vocabulary speech recognition. 374-379
- William Hartmann, Anindya Roy, Lori Lamel, Jean-Luc Gauvain:
Acoustic unit discovery and pronunciation generation from a grapheme-based lexicon. 380-385
- Oliver Walter, Timo Korthals, Reinhold Haeb-Umbach, Bhiksha Raj:
A hierarchical system for word discovery exploiting DTW-based initialization. 386-391
- Bart Ons, Jort F. Gemmeke, Hugo Van hamme:
NMF-based keyword learning from scarce data. 392-397
- Yajie Miao, Florian Metze, Shourabh Rawat:
Deep maxout networks for low-resource speech recognition. 398-403
- Yanmin Qian, Kai Yu, Jia Liu:
Combination of data borrowing strategies for low-resource LVCSR. 404-409
- Keith D. Levin, Katharine Henry, Aren Jansen, Karen Livescu:
Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings. 410-415
- Guoguo Chen, Oguz Yilmaz, Jan Trmal, Daniel Povey, Sanjeev Khudanpur:
Using proxies for OOV keywords in the keyword search task. 416-421
- Fuchun Peng, Scott Roy, Ben Shahshahani, Françoise Beaufays:
Search results based N-best hypothesis rescoring with maximum entropy classification. 422-427
- Ankur Gandhe, Long Qin, Florian Metze, Alexander I. Rudnicky, Ian R. Lane, Matthias Eck:
Using web text to improve keyword spotting in speech. 428-433
- Shilin Liu, Khe Chai Sim:
Multi-stream temporally varying weight regression for cross-lingual speech recognition. 434-439
- Roger Hsiao, Tim Ng, Frantisek Grézl, Damianos G. Karakos, Stavros Tsakalidis, Long Nguyen, Richard M. Schwartz:
Discriminative semi-supervised training for keyword search in low resource languages. 440-445
- Ramya Rasipuram, Marzieh Razavi, Mathew Magimai-Doss:
Probabilistic lexical modeling and unsupervised training for zero-resourced ASR. 446-451
- Joris Driesen, Steve Renals:
Lightly supervised automatic subtitling of weather forecasts. 452-457
- Jahn Heymann, Oliver Walter, Reinhold Haeb-Umbach, Bhiksha Raj:
Unsupervised word segmentation from noisy input. 458-463
- Murat Saraclar, Abhinav Sethy, Bhuvana Ramabhadran, Lidia Mangu, Jia Cui, Xiaodong Cui, Brian Kingsbury, Jonathan Mamou:
An empirical study of confusion modeling in keyword search for low resource languages. 464-469
- Frantisek Grézl, Martin Karafiát:
Semi-supervised bootstrapping approach for neural network feature extractor training. 470-475