24th SPECOM 2022: Gurugram, India
- S. R. Mahadeva Prasanna, Alexey Karpov, K. Samudravijaya, Shyam S. Agrawal:
Speech and Computer - 24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022, Proceedings. Lecture Notes in Computer Science 13721, Springer 2022, ISBN 978-3-031-20979-6
- Eleonora Akinshina, Tatiana Y. Sherstinova:
Thematic Diversity of Everyday Russian Discourse: A Case Study Based on the ORD Corpus. 1-9
- Jahangir Alam, Woo Hyun Kang, Abderrahim Fathan:
Neural Embedding Extractors for Text-Independent Speaker Verification. 10-23
- Anastasia Avdeeva, Sergey Novoselov:
Deep Speaker Embeddings Based Online Diarization. 24-32
- Shikha Baghel, S. R. M. Prasanna, Prithwijit Guha:
Overlapped Speech Detection Using AM-FM Based Time-Frequency Representations. 33-43
- Oindrila Banerjee, D. Govind, Akhilesh Kumar Dubey, Suryakanth V. Gangashetty:
Significance of Dimensionality Reduction in CNN-Based Vowel Classification from Imagined Speech Using Electroencephalogram Signals. 44-55
- Shweta Bansal, Shambhu Sharan, Shyam S. Agrawal:
Study of Speech Recognition System Based on Transformer and Connectionist Temporal Classification Models for Low Resource Language. 56-63
- Rhythm Bhatia, Tomi H. Kinnunen:
An Initial Study on Birdsong Re-synthesis Using Neural Vocoders. 64-74
- Mrinmoy Bhattacharjee, S. R. Mahadeva Prasanna, Prithwijit Guha:
Speech Music Overlap Detection Using Spectral Peak Evolutions. 75-86
- Joyshree Chakraborty, Rohit Sinha, Priyankoo Sarmah:
Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English. 87-98
- Daniil Chernyshev, Boris V. Dobrov:
ClusterVote: Automatic Summarization Dataset Construction with Document Clusters. 99-113
- Shanatip Choosaksakunwiboon, Karla Pizzi, Ching-Yu Kao:
Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples. 114-127
- Maria Chubarova, Tatiana Shevchenko:
Celtic English Continuum in Pitch Patterns of Spontaneous Talk: Evidence of Long-Term Contacts. 128-138
- Dadi Ramesh, Suresh Kumar Sanampudi:
Coherence Based Automatic Essay Scoring Using Sentence Embedding and Recurrent Neural Networks. 139-154
- Goutam Datta, Nisheeth Joshi, Kusum Gupta:
Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore vs BLEU Score. 155-162
- Denis Dresvyanskiy, Yamini Sinha, Matthias Busch, Ingo Siegert, Alexey Karpov, Wolfgang Minker:
DyCoDa: A Multi-modal Data Collection of Multi-user Remote Survival Game Recordings. 163-177
- José Vicente Egas López, Róbert Busa-Fekete, Gábor Gosztolya:
On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection. 178-187
- Abderrahim Fathan, Jahangir Alam, Woo Hyun Kang:
Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection. 188-200
- Parismita Gogoi, Priyankoo Sarmah, S. R. M. Prasanna:
Automatic Rhythm and Speech Rate Analysis of Mising Spontaneous Speech. 201-213
- Aleksey Grigorev, Anna V. Kurazhova, Egor Kleshnev, Aleksandr Nikolaev, Olga V. Frolova, Elena E. Lyakso:
An Electroglottographic Method for Assessing the Emotional State of the Speaker. 214-225
- Priyanka Gupta, Hemant A. Patil:
Significance of Distance on Pop Noise for Voice Liveness Detection. 226-237
- Vishwa Gupta, Gilles Boulianne:
CRIM's Speech Recognition System for OpenASR21 Evaluation with Conformer and Voice Activity Detector Embeddings. 238-251
- Alisa P. Gvozdeva, Alexander M. Lunichkin, Larisa G. Zaytseva, Elena A. Ogorodnikova, Irina G. Andreeva:
Joint Changes in First and Second Formants of /a/, /i/, /u/ Vowels in Babble Noise - a New Statistical Approach. 252-264
- Maria-Loulou Hajj, Martin Lenglet, Olivier Perrotin, Gérard Bailly:
Comparing NLP Solutions for the Disambiguation of French Heterophonic Homographs for End-to-End TTS Systems. 265-278
- Attila Zoltán Jenei, Gábor Kiss, Dávid Sztahó:
Detection of Speech Related Disorders by Pre-trained Embedding Models Extracted Biomarkers. 279-289
- Mélanie Jouaiti, Kerstin Dautenhahn:
Multi-label Dysfluency Classification. 290-301
- Mélanie Jouaiti, Kerstin Dautenhahn:
Harnessing Uncertainty - Multi-label Dysfluency Classification with Uncertain Labels. 302-311
- Aastha Kachhi, Anand Therattil, Priyanka Gupta, Hemant A. Patil:
Continuous Wavelet Transform for Severity-Level Classification of Dysarthria. 312-324
- Aastha Kachhi, Anand Therattil, Ankur T. Patil, Hardik B. Sailor, Hemant A. Patil:
Significance of Energy Features for Severity Classification of Dysarthria. 325-337
- Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan:
An Analytic Study on Clustering-Based Pseudo-labels for Self-supervised Deep Speaker Verification. 338-348
- Irina S. Kipyatkova:
Investigation of Transfer Learning for End-to-End Russian Speech Recognition. 349-357
- Uliana E. Kochetkova, Pavel A. Skrelin, Rada German, Daria Novoselova:
Prosodic Features of Verbal Irony in Russian and French: Universal vs. Language-Specific. 358-371
- Liliya Komalova, Lyubov Kalyuzhnaya:
Categorization of Threatening Speech Acts. 372-381
- Evgeny Kostyuchenko, Ivan Rakhmanenko, Lidiya N. Balatskaya:
Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem. 382-390
- Dani Krebbers, Heysem Kaya, Alexey Karpov:
Multi-level Fusion of Fisher Vector Encoded BERT and Wav2vec 2.0 Embeddings for Native Language Identification. 391-403
- Devesh Kumar, Pavan Kumar V. Patil, Ayush Agarwal, S. R. Mahadeva Prasanna:
Fake Speech Detection Using OpenSMILE Features. 404-415
- Anna Leonteva, Tatiana Sokoreva:
Nonverbal Constituents of Argumentative Discourse: Gesture and Prosody Interaction. 416-425
- Seema Lokhandwala, Priyankoo Sarmah, Rohit Sinha:
Classifying Mahout and Social Interactions of Asian Elephants Based on Trumpet Calls. 426-437
- Elena E. Lyakso, Olga V. Frolova, Anton Matveev, Yuri Matveev, Aleksey Grigorev, Olesia Makhnytkina, Nersisson Ruban:
Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. 438-450
- Raghav Magazine, Ayush Agarwal, Anand Hedge, S. R. Mahadeva Prasanna:
Fake Speech Detection Using Modulation Spectrogram. 451-463
- Danila Mamontov, Wolfgang Minker, Alexey Karpov:
Self-Configuring Genetic Programming Feature Generation in Affect Recognition Tasks. 464-476
- Jose Mathew, Pranjal Sahu, Bhavuk Singhal, Aniket Joshi, Krishna Reddy Medikonda, Jairaj Sathyanarayana:
A Multi-modal Approach to Mining Intent from Code-Mixed Hindi-English Calls in the Hyperlocal-Delivery Domain. 477-493
- Jagabandhu Mishra, S. R. Mahadeva Prasanna:
Importance of Supra-Segmental Information and Self-Supervised Framework for Spoken Language Diarization Task. 494-507
- Anton Nesterenko, Ruslan Akhmerov, Yulia Matveeva, Anna Goremykina, Dmitry Astankov, Evgeniy Shuranov, Alexandra Shirshova:
Low-Resource Emotional Speech Synthesis: Transfer Learning and Data Requirements. 508-521
- Dariya Novokhrestova, Ilya A. Hodashinsky, Evgeny Kostyuchenko, Konstantin S. Sarin, Marina Bardamova:
Fuzzy Classifier for Speech Assessment in Speech Rehabilitation. 522-532
- Moumita Pakrashi, Shakuntala Mahanta:
Analysis-By-Synthesis Modeling of Bengali Intonation. 533-544
- K. S. Pavithra, H. M. Chandrashekar, Veena Karjigi:
Neural Network Based Curve Fitting to Enhance the Intelligibility of Dysarthric Speech. 545-553
- Pavel Posokhov, Anastasia Matveeva, Olesia Makhnytkina, Anton Matveev, Yuri Matveev:
Personalizing Retrieval-Based Dialogue Agents. 554-566
- Rodmonga Potapova, Vsevolod Potapov, Irina Kuryanova:
Forensic Identification of Foreign-Language Speakers by the Method of Structural-Melodic Analysis of Phonograms. 567-578
- Rodmonga Potapova, Vsevolod Potapov, Oleg Kuzmin:
Logistics Translator. Concept Vision on Future Interlanguage Computer Assisted Translation. 579-589
- Aditya Pusuluri, Aastha Kachhi, Hemant A. Patil:
Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification. 590-603
- Elena I. Riekhakaynen, Elena Zatevalova:
Should We Believe Our Eyes or Our Ears? Processing Incongruent Audiovisual Stimuli by Russian Listeners. 604-615
- Elena Ryumina, Denis Ivanko:
Emotional Speech Recognition Based on Lip-Reading. 616-625
- Anna Shestakova, Andrea Corradini:
Exploring the Use of Machine Learning for Resume Recommendations. 626-640
- Tatiana Sokoreva, Tatiana Shevchenko:
The Role of Pause in Interaction: A Case of Polylogue. 641-650
- Valery D. Solovyev, Musa Islamov, Venera Bayrasheva:
Dictionary with the Evaluation of Positivity/Negativity Degree of the Russian Words. 651-664
- Nikolaos Tsiftsis, Konstantinos Moustakas, Nikolaos D. Fakotakis:
Effects of Depth of Field on Focus Using a Virtual Reality Escape Room. 665-675
- Yaroslav Turovsky, Daniyar Wolf, Roman V. Meshcheryakov, Anastasia Iskhakova:
Dynamics of Frequency Characteristics of Visually Evoked Potentials of Electroencephalography During the Work with Brain-Computer Interfaces. 676-687
- Spoorthy Venkatesh, Shashidhar G. Koolagudi:
Device Robust Acoustic Scene Classification Using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network. 688-699
- Zhandos Yessenbayev, Zhanibek Kozhirbayev:
Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology. 700-711
- Alexander Zatvornitskiy:
Low-Cost Training of Speech Recognition System for Hindi ASR Challenge 2022. 712-718