IEEE Transactions on Audio, Speech & Language Processing, Volume 17
Volume 17, Number 1, January 2009
- Serdar Yildirim, Shrikanth S. Narayanan:
Automatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Information. 2-12 - Abhinav Sethy, Panayiotis G. Georgiou, Bhuvana Ramabhadran, Shrikanth S. Narayanan:
An Iterative Relative Entropy Minimization-Based Data Selection Approach for n-Gram Model Adaptation. 13-23 - Jiucang Hao, Hagai Attias, Srikantan S. Nagarajan, Te-Won Lee, Terrence J. Sejnowski:
Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation. 24-37 - Simon Doclo, Marc Moonen, Tim Van den Bogaert, Jan Wouters:
Reduced-Bandwidth and Distributed MWF-Based Noise Reduction Algorithms for Binaural Hearing Aids. 38-51 - Yoshifumi Nagata, Satoshi Iwasaki, Takahiko Hariyama, Toyota Fujioka, Tomita Obara, Takayuki Wakatake, Masato Abe:
Binaural Localization Based on Weighted Wiener Gain Improved by Incremental Source Attenuation. 52-65 - Junichi Yamagishi, Takao Kobayashi, Yuji Nakano, Katsumi Ogata, Juri Isogai:
Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm. 66-83 - Shih-Hsiang Lin, Berlin Chen, Yao-Ming Yeh:
Exploring the Use of Speech Features and Their Corresponding Distribution Characteristics for Robust Speech Recognition. 84-94 - Yi-Ting Chen, Berlin Chen, Hsin-Min Wang:
A Probabilistic Generative Framework for Extractive Broadcast News Speech Summarization. 95-106 - Yan Jennifer Wu, Thushara D. Abhayapala:
Theory and Design of Soundfield Reproduction Using Continuous Loudspeaker Concept. 107-116 - Radoslaw Mazur, Alfred Mertins:
An Approach for Solving the Permutation Problem of Convolutive Blind Source Separation Based on Statistical Signal Models. 117-126 - Chung-Hsien Wu, Chung-Han Lee, Chung-Hau Liang:
Idiolect Extraction and Generation for Personalized Speaking Style Modeling. 127-137 - Sankaranarayanan Ananthakrishnan, Shrikanth S. Narayanan:
Unsupervised Adaptation of Categorical Prosody Models for Prosody Labeling and Speech Recognition. 138-149 - René Martinus Maria Derkx, K. Janse:
Theoretical Analysis of a First-Order Azimuth-Steerable Superdirective Microphone Array. 150-162 - Ebru Arisoy, Murat Saraclar:
Lattice Extension and Vocabulary Adaptation for Turkish LVCSR. 163-173 - Cyril Joder, Slim Essid, Gaël Richard:
Temporal Integration for Audio Classification With Application to Musical Instrument Classification. 174-186 - Man-Hung Siu, Xi Yang, Herbert Gish:
Discriminatively Trained GMMs for Language Classification Using Boosting Methods. 187-197
Volume 17, Number 2, February 2009
- Chang-Wen Hsu, Lin-Shan Lee:
Higher Order Cepstral Moment Normalization for Improved Robust Speech Recognition. 205-220 - Rade Kutil:
Optimized Sinusoid Synthesis via Inverse Truncated Fourier Transform. 221-230 - Takuya Yoshioka, Tomohiro Nakatani, Masato Miyoshi:
Integrated Speech Enhancement Method Using Noise Suppression and Dereverberation. 231-246 - György Wersényi:
Effect of Emulated Head-Tracking for Reducing Localization Errors in Virtual Audio Simulation. 247-252 - P. Krishnamoorthy, S. Prasanna:
Reverberant Speech Enhancement by Temporal and Spectral Processing. 253-266 - Jen-Tzung Chien, Meng-Sung Wu:
Minimum Rank Error Language Modeling. 267-276 - Prem C. Pandey, Milind S. Shah:
Estimation of Place of Articulation During Stop Closures of Vowel-Consonant-Vowel Utterances. 277-286 - George Almpanidis, Margarita Kotti, Constantine Kotropoulos:
Robust Detection of Phone Boundaries Using Model Selection Criteria With Few Observations. 287-298 - Leandro E. Di Persia, Diego H. Milone, Masuzo Yanagida:
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources. 299-311 - Matthias Wölfel:
Enhanced Speech Features by Single-Channel Joint Compensation of Noise and Reverberation. 312-323 - Marc Delcroix, Tomohiro Nakatani, Shinji Watanabe:
Static and Dynamic Variance Compensation for Recognition of Reverberant Speech With Dereverberation Preprocessing. 324-334 - François Pachet, Pierre Roy:
Improving Multilabel Analysis of Music Titles: A Large-Scale Validation of the Correction Approach. 335-343 - Rahim Saeidi, H. R. S. Mohammadi, Todor Ganchev, Robert D. Rodman:
Particle Swarm Optimization for Sorted Adapted Gaussian Mixture Models. 344-353 - Yasser Hifny, Steve Renals:
Speech Recognition Using Augmented Conditional Random Fields. 354-365 - John H. L. Hansen, Vaishnevi S. Varadarajan:
Analysis and Compensation of Lombard Speech Across Noise Type and Levels With Application to In-Set/Out-of-Set Speaker Recognition. 366-378 - Shaminda Subasingha, Manohar N. Murthi, Søren Vang Andersen:
Gaussian Mixture Kalman Predictive Coding of Line Spectral Frequencies. 379-391 - Jacek Dmochowski, Jacob Benesty, Sofiène Affes:
An Information-Theoretic View of Array Processing. 392-401
Volume 17, Number 3, March 2009
- Athanassios Katsamanis, George Papandreou, Petros Maragos:
Face Active Appearance Modeling and Speech Acoustic Information to Recover Articulation. 411-422 - George Papandreou, Athanassios Katsamanis, Vassilis Pitsikalis, Petros Maragos:
Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition. 423-435 - Eduardo Sánchez-Soto, Alexandros Potamianos, Khalid Daoudi:
Unsupervised Stream-Weights Computation in Classification and Recognition Tasks. 436-445 - Jon Barker, Xu Shao:
Energetic and Informational Masking Effects in an Audiovisual Speech Recognition System. 446-458 - Javier Melenchón, Elisa Martínez, Fernando De la Torre, José Antonio Montero:
Emphatic Visual Speech Synthesis. 459-468 - Jianhua Tao, Le Xin, Panrong Yin:
Realistic Visual Speech Synthesis Based on Hybrid Concatenation Method. 469-477 - Peng Liu, Frank K. Soong:
Graph-Based Partial Hypothesis Fusion for Pen-Aided Speech Input. 478-485 - Pui-Yu Hui, Helen M. Meng:
Cross-Modality Semantic Integration With Hypothesis Rescoring for Robust Interpretation of Multimodal User Interactions. 486-500 - Dinesh Babu Jayagopi, Hayley Hung, Chuohao Yeo, Daniel Gatica-Perez:
Modeling Dominance in Group Conversations Using Nonverbal Activity Cues. 501-513
Volume 17, Number 4, May 2009
- A. Homayoun Kamkar-Parsi, Martin Bouchard:
Improved Noise Power Spectrum Density Estimation for Binaural Hearing Aids Operating in a Diffuse Noise Field Environment. 521-533 - Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi:
Suppression of Late Reverberation Effect on Speech Signal Using Long-Term Multiple-Step Linear Prediction. 534-545 - Ronen Talmon, Israel Cohen, Sharon Gannot:
Relative Transfer Function Identification Using Convolutive Transfer Function Approximation. 546-555 - S. R. Mahadeva Prasanna, B. V. Sandeep Reddy, P. Krishnamoorthy:
Vowel Onset Point Detection Using Source, Spectral Peaks, and Modulation Spectrum Energies. 556-565 - Liang Wang, Woon-Seng Gan:
Convergence Analysis of Narrowband Active Noise Equalizer System Under Imperfect Secondary Path Estimation. 566-571 - Leonardo Rey Vega, Hernan Rey, Jacob Benesty, Sara Tressens:
A Family of Robust Algorithms Exploiting Sparsity in Adaptive Filters. 572-581 - Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan:
Analysis of Emotionally Salient Aspects of Fundamental Frequency for Emotion Detection. 582-596 - Martin Kuster:
Multichannel Room Impulse Response Generation With Coherence Control. 597-606 - Satyabrata Sen, Arye Nehorai:
Performance Analysis of 3-D Direction Estimation Based on Head-Related Transfer Function. 607-613 - B. Yegnanarayana, K. Sri Rama Murty:
Event-Based Instantaneous Fundamental Frequency Estimation From Speech Signals. 614-624 - Zhaozhang Jin, DeLiang Wang:
A Supervised Learning Approach to Monaural Segregation of Reverberant Speech. 625-638 - Hiroko Kato Solvang, Yuichi Nagahara, Shoko Araki, Hiroshi Sawada, Shoji Makino:
Frequency-Domain Pearson Distribution Approach for Independent Component Analysis (FD-Pearson-ICA) in Blind Source Separation. 639-649 - Yu Takahashi, Tomoya Takatani, Keiichi Osako, Hiroshi Saruwatari, Kiyohiro Shikano:
Blind Spatial Subtraction Array for Speech Enhancement in Noisy Environment. 650-664 - Huawei Chen, Wee Ser:
Design of Robust Broadband Beamformers With Passband Shaping Characteristics Using Tikhonov Regularization. 665-681 - Jwu-Sheng Hu, Wei-Han Liu:
Location Classification of Nonstationary Sound Sources Using Binaural Room Distribution Patterns. 682-692 - Jesper Højvang Jensen, Mads Græsbøll Christensen, Daniel P. W. Ellis, Søren Holdt Jensen:
Quantitative Analysis of a Common Audio Similarity Measure. 693-703 - Hong Kook Kim, Richard C. Rose:
Cepstrum-Domain Model Combination Based on Decomposition of Speech and Noise Using MMSE-LSA for ASR in Noisy Environments. 704-713 - Kai Yu, Mark J. F. Gales, Philip C. Woodland:
Unsupervised Adaptation With Discriminative Mapping Transforms. 714-723 - Teemu Hirsimäki, Janne Pylkkönen, Mikko Kurimo:
Importance of High-Order N-Gram Models in Morph-Based Speech Recognition. 724-732 - Jost Schatzmann, Steve J. Young:
The Hidden Agenda User Simulation Model. 733-747 - Chris Longworth, Mark J. F. Gales:
Combining Derivative and Parametric Kernels for Speaker Verification. 748-757 - Nicolás Morales, Doroteo Torre Toledano, John H. L. Hansen, Javier Garrido Salas:
Feature Compensation Techniques for ASR on Band-Limited Speech. 758-774 - Yannis Agiomyrgiannakis, Yannis Stylianou:
Wrapped Gaussian Mixture Models for Modeling and High-Rate Quantization of Phase Data of Speech. 775-786 - Jingdong Chen, Jacob Benesty, Yiteng Huang:
Study of the Noise-Reduction Problem in the Karhunen-Loève Expansion Domain. 787-802 - Klaus Macherey, Oliver Bender, Hermann Ney:
Applications of Statistical Machine Translation Approaches to Spoken Language Understanding. 803-818 - Wen Zhang, Rodney A. Kennedy, Thushara D. Abhayapala:
Efficient Continuous HRTF Model Using Data Independent Basis Functions: Experimentally Guided Approach. 819-829 - Malay Gupta, Scott C. Douglas:
A Spatio-Temporal Speech Enhancement Technique Based on Generalized Eigenvalue Decomposition. 830-839 - Pavel Ircing, Josef V. Psutka, Josef Psutka:
Using Morphological Information for Robust Language Modeling in Czech ASR System. 840-847 - Vijendra Raj Apsingekar, Phillip L. De Leon:
Speaker Model Clustering for Efficient Speaker Identification in Large Population Applications. 848-853
Volume 17, Number 5, July 2009
- Ruhi Sarikaya, Katrin Kirchhoff, Tanja Schultz, Dilek Hakkani-Tür:
Introduction to the Special Issue on Processing Morphologically Rich Languages. 861-862 - Thomas Pellegrini, Lori Lamel:
Automatic Word Decompounding for ASR in a Morphologically Rich Language: Application to Amharic. 863-873 - Ebru Arisoy, Dogan Can, Siddika Parlak, Hasim Sak, Murat Saraclar:
Turkish Broadcast News Transcription and Retrieval. 874-883 - Hagen Soltau, George Saon, Brian Kingsbury, Hong-Kwang Jeff Kuo, Lidia Mangu, Daniel Povey, Ahmad Emami:
Advances in Arabic Speech Transcription at IBM Under the DARPA GALE Program. 884-894 - Ümit Güz, Benoît Favre, Dilek Hakkani-Tür, Gökhan Tür:
Generative and Discriminative Methods Using Morphological Information for Sentence Segmentation of Turkish. 895-903 - Xabier Artola, Arantza Díaz de Ilarraza Sánchez, Aitor Soroa, Aitor Sologaistoa:
Dealing With Complex Linguistic Annotations Within a Language Processing Framework. 904-915 - Mohamed Attia, Mohsen A. Rashwan, Mohamed Al-Badrashiny:
Fassieh, a Semi-Automatic Visual Interactive Tool for Morphological, PoS-Tags, Phonetic, and Semantic Annotation of Arabic Text Corpora. 916-925 - Yassine Benajiba, Mona T. Diab, Paolo Rosso:
Arabic Named Entity Recognition: A Feature-Driven Study. 926-934 - Imed Zitouni, Xiaoqiang Luo, Radu Florian:
A Cascaded Approach to Mention Detection and Chaining in Arabic. 935-944 - Do-Gil Lee, Hae-Chang Rim:
Probabilistic Modeling of Korean Morphology. 945-955 - Kseniya B. Shalonova, Bruno Golénia, Peter A. Flach:
Towards Learning Morphology for Under-Resourced Fusional and Agglutinating Languages. 956-965 - Paris Smaragdis:
Dynamic Range Extension Using Interleaved Gains. 966-973 - Stefan Windmann, Reinhold Haeb-Umbach:
Approaches to Iterative Speech Feature Enhancement and Recognition. 974-984 - Gerald Friedland, Oriol Vinyals, Yan Huang, Christian A. Müller:
Prosodic and Other Long-Term Features for Speaker Diarization. 985-993 - Ken'ichi Kumatani, John W. McDonough, Barbara Rauch, Dietrich Klakow, Philip N. Garner, Weifeng Li:
Beamforming With a Maximum Negentropy Criterion. 994-1008 - Ozlem Kalinli, Shrikanth S. Narayanan:
Prominence Detection Using Auditory Attention Cues and Task-Dependent High Level Information. 1009-1024 - Yu Tsao, Chin-Hui Lee:
An Ensemble Speaker and Speaking Environment Modeling Approach to Robust Speech Recognition. 1025-1037 - Ali A. Milani, Issa M. S. Panahi, Philipos C. Loizou:
A New Delayless Subband Adaptive Filtering Algorithm for Active Noise Control Systems. 1038-1045 - Arie Livshin, Xavier Rodet:
Purging Musical Instrument Sample Databases Using Automatic Musical Instrument Recognition Methods. 1046-1051
Volume 17, Number 6, August 2009
- Nikolay D. Gaubitch, Patrick A. Naylor:
Equalization of Multichannel Acoustic Systems in Oversampled Subbands. 1061-1070 - Shmulik Markovich, Sharon Gannot, Israel Cohen:
Multichannel Eigenspace Beamforming in a Reverberant Noisy Environment With Multiple Interfering Speech Signals. 1071-1086 - Jerónimo Arenas-García, Aníbal R. Figueiras-Vidal:
Adaptive Combination of Proportionate Filters for Sparse Echo Cancellation. 1087-1098 - Tiemin Mei, Fuliang Yin, Jun Wang:
Blind Source Separation Based on Cumulants With Time and Frequency Non-Properties. 1099-1108 - Jacob Benesty, Jingdong Chen, Yiteng Arden Huang:
Noise Reduction Algorithms in a Generalized Transform Domain. 1109-1123 - Tianshu Qu, Zheng Xiao, Mei Gong, Ying Huang, Xiaodong Li, Xihong Wu:
Distance-Dependent Head-Related Transfer Functions Measured With High Spatial Resolution Using a Spark Gap. 1124-1132 - Nima Khademi Kalantari, Mohammad Ali Akhaee, Seyed Mohammad Ahadi, Hamidreza Amindavar:
Robust Multiplicative Patchwork Method for Audio Watermarking. 1133-1141 - Selina Chu, Shrikanth S. Narayanan, C.-C. Jay Kuo:
Environmental Sound Recognition With Time-Frequency Audio Features. 1142-1158 - Jouni Paulus, Anssi Klapuri:
Music Structure Analysis Using a Probabilistic Fitness Measure and a Greedy Search Algorithm. 1159-1170 - Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, Ren-Hua Wang:
Integrating Articulatory Features Into HMM-Based Parametric Speech Synthesis. 1171-1185 - Patricia Henríquez, Jesús B. Alonso, Miguel A. Ferrer, Carlos M. Travieso, Juan Ignacio Godino-Llorente, Fernando Díaz-de-María:
Characterization of Healthy and Pathological Voice Through Measures Based on Nonlinear Dynamics. 1186-1195 - B. Yegnanarayana, R. Kumaraswamy, K. Sri Rama Murty:
Determining Mixing Parameters From Multispeaker Data Using Speech-Specific Information. 1196-1207 - Junichi Yamagishi, Takashi Nose, Heiga Zen, Zhen-Hua Ling, Tomoki Toda, Keiichi Tokuda, Simon King, Steve Renals:
Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis. 1208-1230 - Yao Qian, Hui Liang, Frank K. Soong:
A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin-English) TTS. 1231-1239
Volume 17, Number 7, September 2009
- Mei-Yuh Hwang, Gang Peng, Mari Ostendorf, Wen Wang, Arlo Faria, Aaron Heidel:
Building A Highly Accurate Mandarin Speech Recognizer With Language-Independent Technologies and Language-Dependent Modules. 1253-1262 - Che-Kuang Lin, Lin-Shan Lee:
Improved Features and Models for Detecting Edit Disfluencies in Transcribing Spontaneous Mandarin Speech. 1263-1278 - Jen-Tzung Chien, Chuan-Wei Ting:
Acoustic Factor Analysis for Streamed Hidden Markov Modeling. 1279-1291 - Wooil Kim, John H. L. Hansen:
Time-Frequency Correlation-Based Missing-Feature Reconstruction for Robust Speech Recognition in Band-Restricted Conditions. 1292-1304 - Hung-Yu Su, Chung-Hsien Wu:
Improving Structural Statistical Machine Translation for Sign Language With Small Corpus Using Thematic Role Templates as Translation Memory. 1305-1315 - Engin Erzin:
Improving Throat Microphone Speech Recognition by Joint Analysis of Throat and Acoustic Microphone Recordings. 1316-1324 - Mohamed Afify, Xiaodong Cui, Yuqing Gao:
Stereo-Based Stochastic Mapping for Robust Speech Recognition. 1325-1334 - Rong Tong, Bin Ma, Haizhou Li, Chng Eng Siong:
A Target-Oriented Phonotactic Front-End for Spoken Language Recognition. 1335-1347 - Dong Yu, Li Deng, Yifan Gong, Alex Acero:
A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models. 1348-1360 - Yipeng Li, John Woodruff, DeLiang Wang:
Monaural Musical Sound Separation Based on Pitch and Common Amplitude Modulation. 1361-1371 - Brian Kan-Wing Mak, Tsz-Chung Lai, Ivor W. Tsang, James Tin-Yau Kwok:
Maximum Penalized Likelihood Kernel Regression for Fast Adaptation. 1372-1381 - Deepu Vijayasenan, Fabio Valente, Hervé Bourlard:
An Information Theoretic Approach to Speaker Diarization of Meeting Data. 1382-1393 - Nitish Krishnamurthy, John H. L. Hansen:
Babble Noise: Modeling, Analysis, and Applications. 1394-1407 - Giso Grimm, Volker Hohmann, Birger Kollmeier:
Increase and Subjective Evaluation of Feedback Stability in Hearing Aids by a Binaural Coherence-Based Noise Reduction Scheme. 1408-1419 - Ronen Talmon, Israel Cohen, Sharon Gannot:
Convolutive Transfer Function Generalized Sidelobe Canceler. 1420-1434 - Jayme Garcia Arnal Barbedo, Amauri Lopes, Patrick J. Wolfe:
Empirical Methods to Determine the Number of Sources in Single-Channel Musical Signals. 1435-1444
Volume 17, Number 8, November 2009
- Aren Jansen, Partha Niyogi:
Point Process Models for Spotting Keywords in Continuous Speech. 1457-1470 - Viet Bac Le, Laurent Besacier:
Automatic Speech Recognition for Under-Resourced Languages: Application to Vietnamese Language. 1471-1482 - Christos Tzagkarakis, Athanasios Mouchtaris, Panagiotis Tsakalides:
A Multichannel Sinusoidal Model Applied to Spot Microphone Signals for Immersive Audio. 1483-1497 - Sampo Vesa:
Binaural Sound Source Distance Learning in Rooms. 1498-1507 - Ioannis Andrianakis, Paul R. White:
A Speech Enhancement Algorithm Based on a Chi MRF Model of the Speech STFT Amplitudes. 1508-1517 - Emre Özkan, I. Yücel Özbek, Mübeccel Demirekler:
Dynamic Speech Spectrum Representation and Tracking Variable Number of Vocal Tract Resonance Frequencies With Time-Varying Dirichlet Process Mixture Models. 1518-1532 - Ilana Heintz, Eric Fosler-Lussier, Chris Brew:
Discriminative Input Stream Combination for Conditional Random Field Phone Recognition. 1533-1546 - Zhi-Sheng Chen, Jyh-Shing Roger Jang:
On the Use of Anti-Word Models for Audio Music Annotation and Retrieval. 1547-1556 - Mark R. P. Thomas, Patrick A. Naylor:
The SIGMA Algorithm: A Glottal Activity Detector for Electroglottographic Signals. 1557-1566 - Zhiyong Wu, Helen M. Meng, Hongwu Yang, Lianhong Cai:
Modeling the Expressivity of Input Text Semantics for Chinese Text-to-Speech Synthesis in a Spoken Dialog System. 1567-1576 - Stefan Windmann, Reinhold Haeb-Umbach:
Parameter Estimation of a State-Space Model of Noise for Robust Speech Recognition. 1577-1590 - Pradeep Loganathan, Andy W. H. Khong, Patrick A. Naylor:
A Class of Sparseness-Controlled Algorithms for Echo Cancellation. 1591-1601 - Bo Shao, Mitsunori Ogihara, Dingding Wang, Tao Li:
Music Recommendation Based on Acoustic Features and User Access Patterns. 1602-1611 - Chung-Hsien Wu, Chia-Hsin Hsieh:
Story Segmentation and Topic Classification of Broadcast News via a Topic-Based Segmental Model and a Genetic Algorithm. 1612-1623 - Konrad Hofbauer, Gernot Kubin, W. Bastiaan Kleijn:
Speech Watermarking for Analog Flat-Fading Bandpass Channels. 1624-1637