Tamás Grósz
2020 – today
- 2024
- [j9] Georgios Karakasidis, Mikko Kurimo, Peter Bell, Tamás Grósz:
Comparison and analysis of new curriculum criteria for end-to-end ASR. Speech Commun. 163: 103113 (2024)
- [j8] Aku Rouhe, Tamás Grósz, Mikko Kurimo:
Principled Comparisons for End-to-End Speech Recognition: Attention vs Hybrid at the 1000-Hour Scale. IEEE ACM Trans. Audio Speech Lang. Process. 32: 623-638 (2024)
- [j7] Dejan Porjazovski, Tamás Grósz, Mikko Kurimo:
From Raw Speech to Fixed Representations: A Comprehensive Evaluation of Speech Embedding Techniques. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3546-3560 (2024)
- [c42] Anne Marte Haug Olstad, Anna Smolander, Sofia Strömbergsson, Sari Ylinen, Minna Lehtonen, Mikko Kurimo, Yaroslav Getman, Tamás Grósz, Xinwei Cao, Torbjørn Svendsen, Giampiero Salvi:
Collecting Linguistic Resources for Assessing Children's Pronunciation of Nordic Languages. LREC/COLING 2024: 3529-3537
- [c41] Anja Virkkunen, Marek Sarvas, Guangpu Huang, Tamás Grósz, Mikko Kurimo:
Investigating the Clusters Discovered By Pre-Trained AV-HuBERT. ICASSP 2024: 11196-11200
- [c40] Mehedi Hasan Bijoy, Dejan Porjazovski, Nhan Phan, Guangpu Huang, Tamás Grósz, Mikko Kurimo:
Multimodal Humor Detection and Social Perception Prediction. MuSe@ACM Multimedia 2024: 60-64
- 2023
- [j6] Yaroslav Getman, Nhan Phan, Ragheb Al-Ghezi, Ekaterina Voskoboinik, Mittul Singh, Tamás Grósz, Mikko Kurimo, Giampiero Salvi, Torbjørn Svendsen, Sofia Strömbergsson, Anna-Riikka Smolander, Sari Ylinen:
Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children. IEEE Access 11: 86025-86037 (2023)
- [j5] Anssi Moisio, Dejan Porjazovski, Aku Rouhe, Yaroslav Getman, Anja Virkkunen, Ragheb Al-Ghezi, Mietta Lennes, Tamás Grósz, Krister Lindén, Mikko Kurimo:
Lahjoita puhetta: a large-scale corpus of spoken Finnish with some benchmarks. Lang. Resour. Evaluation 57(3): 1295-1327 (2023)
- [c39] Dejan Porjazovski, Tamás Grósz, Mikko Kurimo:
Topic Identification for Spontaneous Speech: Enriching Audio Features with Embedded Linguistic Information. EUSIPCO 2023: 396-400
- [c38] Tamás Grósz, Yaroslav Getman, Ragheb Al-Ghezi, Aku Rouhe, Mikko Kurimo:
Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model. INTERSPEECH 2023: 196-200
- [c37] Dejan Porjazovski, Yaroslav Getman, Tamás Grósz, Mikko Kurimo:
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference. ACM Multimedia 2023: 9477-9481
- [c36] Tamás Grósz, Anja Virkkunen, Dejan Porjazovski, Mikko Kurimo:
Discovering Relevant Sub-spaces of BERT, Wav2Vec 2.0, ELECTRA and ViT Embeddings for Humor and Mimicked Emotion Recognition with Integrated Gradients. MuSe@ACM Multimedia 2023: 27-34
- [c35] Nhan Phan, Tamás Grósz, Mikko Kurimo:
CaptainA - A mobile app for practising Finnish pronunciation. NoDaLiDa 2023: 265-270
- [c34] Reima Karhila, Sari Ylinen, Anna-Riikka Smolander, Aku Rouhe, Ragheb Al-Ghezi, Yaroslav Getman, Tamás Grósz, Maria Uther, Mikko Kurimo:
A pronunciation Scoring System Embedded into Children's Foreign Language Learning Games with Experimental Verification of Learning Benefits. SLaTE 2023: 21-25
- [c33] Yaroslav Getman, Ragheb Al-Ghezi, Tamás Grósz, Mikko Kurimo:
Multi-task wav2vec2 Serving as a Pronunciation Training System for Children. SLaTE 2023: 36-40
- [i10] Dejan Porjazovski, Tamás Grósz, Mikko Kurimo:
Topic Identification For Spontaneous Speech: Enriching Audio Features With Embedded Linguistic Information. CoRR abs/2307.11450 (2023)
- [i9] Dejan Porjazovski, Yaroslav Getman, Tamás Grósz, Mikko Kurimo:
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference. CoRR abs/2310.10179 (2023)
- 2022
- [c32] Tamás Grósz, Noora Kallioniemi, Harri Kiiskinen, Kimmo Laine, Anssi Moisio, Tommi Römpötti, Anja Virkkunen, Hannu Salmi, Mikko Kurimo, Jorma Laaksonen:
Tracing Signs of Urbanity in the Finnish Fiction Film of the 1950s: Toward a Multimodal Analysis of Audiovisual Data. DHNB 2022: 63-78
- [c31] Georgios Karakasidis, Tamás Grósz, Mikko Kurimo:
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR. INTERSPEECH 2022: 66-70
- [c30] Yaroslav Getman, Ragheb Al-Ghezi, Katja Voskoboinik, Tamás Grósz, Mikko Kurimo, Giampiero Salvi, Torbjørn Svendsen, Sofia Strömbergsson:
wav2vec2-based Speech Rating System for Children with Speech Sound Disorder. INTERSPEECH 2022: 3618-3622
- [c29] Tamás Grósz, Dejan Porjazovski, Yaroslav Getman, Sudarsana Reddy Kadiri, Mikko Kurimo:
Wav2vec2-based Paralinguistic Systems to Recognise Vocalised Emotions and Stuttering. ACM Multimedia 2022: 7026-7029
- [i8] Anssi Moisio, Dejan Porjazovski, Aku Rouhe, Yaroslav Getman, Anja Virkkunen, Tamás Grósz, Krister Lindén, Mikko Kurimo:
Lahjoita puhetta - a large-scale corpus of spoken Finnish with some benchmarks. CoRR abs/2203.12906 (2022)
- [i7] Georgios Karakasidis, Tamás Grósz, Mikko Kurimo:
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR. CoRR abs/2208.05782 (2022)
- [i6] Tamás Grósz, Mittul Singh, Sudarsana Reddy Kadiri, Hemant Kumar Kathania, Mikko Kurimo:
End-to-end Ensemble-based Feature Selection for Paralinguistics Tasks. CoRR abs/2210.15978 (2022)
- 2021
- [c28] Tamás Grósz, Mikko Kurimo:
LSTM-XL: Attention Enhanced Long-Term Memory for LSTM Cells. TDS 2021: 382-393
- 2020
- [j4] Rudolf Ferenc, Dénes Bán, Tamás Grósz, Tibor Gyimóthy:
Deep learning in static, metric-based bug prediction. Array 6: 100021 (2020)
- [j3] Gábor Gosztolya, Tamás Grósz, László Tóth:
Social Signal Detection by Probabilistic Sampling DNN Training. IEEE Trans. Affect. Comput. 11(1): 164-177 (2020)
- [c27] Hemant Kumar Kathania, Mittul Singh, Tamás Grósz, Mikko Kurimo:
Data Augmentation Using Prosody and False Starts to Recognize Non-Native Children's Speech. INTERSPEECH 2020: 260-264
- [c26] Tamás Grósz, Mikko Kurimo:
Visual Interpretation of DNN-based Acoustic Models using Deep Autoencoders. MLVis@Eurographics/EuroVis 2020: 25-29
- [i5] Tamás Grósz, Mittul Singh, Sudarsana Reddy Kadiri, Hemant Kumar Kathania, Mikko Kurimo:
Aalto's End-to-End DNN systems for the INTERSPEECH 2020 Computational Paralinguistics Challenge. CoRR abs/2008.02689 (2020)
- [i4] Hemant Kumar Kathania, Mittul Singh, Tamás Grósz, Mikko Kurimo:
Data augmentation using prosody and false starts to recognize non-native children's speech. CoRR abs/2008.12914 (2020)
2010 – 2019
- 2019
- [j2] László Varga, Attila Kovács, Tamás Grósz, Géza Thury, Flóra Hadarits, Rózsa Dégi, József Dombi:
Automatic segmentation of hyperreflective foci in OCT images. Comput. Methods Programs Biomed. 178: 91-103 (2019)
- [c25] Gergely Pap, Gábor Lékó, Tamás Grósz:
A Reconstruction-Free Projection Selection Procedure for Binary Tomography Using Convolutional Neural Networks. ICIAR (1) 2019: 228-236
- [c24] Gábor Gosztolya, Ádám Pintér, László Tóth, Tamás Grósz, Alexandra Markó, Tamás Gábor Csapó:
Autoencoder-Based Articulatory-to-Acoustic Mapping for Ultrasound Silent Speech Interfaces. IJCNN 2019: 1-8
- [c23] Tamás Gábor Csapó, Mohammed Salah Al-Radhi, Géza Németh, Gábor Gosztolya, Tamás Grósz, László Tóth, Alexandra Markó:
Ultrasound-Based Silent Speech Interface Built on a Continuous Vocoder. INTERSPEECH 2019: 894-898
- [p1] György Kovács, Tamás Grósz, Tamás Váradi:
Using Deep Rectifier Neural Nets and Probabilistic Sampling for Topical Unit Classification. Cognitive Infocommunications, Theory and Applications 2019: 1-24
- [i3] Gábor Gosztolya, Ádám Pintér, László Tóth, Tamás Grósz, Alexandra Markó, Tamás Gábor Csapó:
Autoencoder-Based Articulatory-to-Acoustic Mapping for Ultrasound Silent Speech Interfaces. CoRR abs/1904.05259 (2019)
- [i2] Tamás Gábor Csapó, Mohammed Salah Al-Radhi, Géza Németh, Gábor Gosztolya, Tamás Grósz, László Tóth, Alexandra Markó:
Ultrasound-based Silent Speech Interface Built on a Continuous Vocoder. CoRR abs/1906.09885 (2019)
- 2018
- [b1] Tamás Grósz:
Training Methods for Deep Neural Network-Based Acoustic Models in Speech Recognition. University of Szeged, Hungary, 2018
- [j1] Péter Bodnár, Tamás Grósz, László Tóth, László G. Nyúl:
Efficient visual code localization with neural networks. Pattern Anal. Appl. 21(1): 249-260 (2018)
- [c22] Tamás Grósz, Gábor Gosztolya, László Tóth, Tamás Gábor Csapó, Alexandra Markó:
F0 Estimation for DNN-Based Ultrasound Silent Speech Interfaces. ICASSP 2018: 291-295
- [c21] Melinda Katona, Attila Kovács, László Varga, Tamás Grósz, József Dombi, Rózsa Dégi, László G. Nyúl:
Automatic Detection and Characterization of Biomarkers in OCT Images. ICIAR 2018: 706-714
- [c20] Gábor Gosztolya, Tamás Grósz, László Tóth:
General Utterance-Level Feature Extraction for Classifying Crying Sounds, Atypical & Self-Assessed Affect and Heart Beats. INTERSPEECH 2018: 531-535
- [c19] László Tóth, Gábor Gosztolya, Tamás Grósz, Alexandra Markó, Tamás Gábor Csapó:
Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces. INTERSPEECH 2018: 3172-3176
- 2017
- [c18] Tamás Grósz, Gábor Gosztolya, László Tóth:
Training Context-Dependent DNN Acoustic Models Using Probabilistic Sampling. INTERSPEECH 2017: 1621-1625
- [c17] Tamás Grósz, Gábor Gosztolya, László Tóth:
A Comparative Evaluation of GMM-Free State Tying Methods for ASR. INTERSPEECH 2017: 1626-1630
- [c16] Gábor Gosztolya, Róbert Busa-Fekete, Tamás Grósz, László Tóth:
DNN-Based Feature Extraction and Classifier Combination for Child-Directed Speech, Cold and Snoring Identification. INTERSPEECH 2017: 3522-3526
- [c15] Tamás Gábor Csapó, Tamás Grósz, Gábor Gosztolya, László Tóth, Alexandra Markó:
DNN-Based Ultrasound-to-Speech Conversion for a Silent Speech Interface. INTERSPEECH 2017: 3672-3676
- 2016
- [c14] György Kovács, Tamás Grósz, Tamás Váradi:
Topical unit classification using deep neural nets and probabilistic sampling. CogInfoCom 2016: 199-204
- [c13] Gábor Gosztolya, László Tóth, Tamás Grósz, Veronika Vincze, Ildikó Hoffmann, Gréta Szatlóczki, Magdolna Pákáski, János Kálmán:
Detecting Mild Cognitive Impairment from Spontaneous Speech by Correlation-Based Phonetic Feature Selection. INTERSPEECH 2016: 107-111
- [c12] Gábor Gosztolya, Tamás Grósz, György Szaszák, László Tóth:
Estimating the Sincerity of Apologies in Speech by DNN Rank Learning and Prosodic Analysis. INTERSPEECH 2016: 2026-2030
- [c11] Gábor Gosztolya, Tamás Grósz, Róbert Busa-Fekete, László Tóth:
Determining Native Language and Deception Using Phonetic Features and Classifier Combination. INTERSPEECH 2016: 2418-2422
- [c10] Gábor Gosztolya, Tamás Grósz, László Tóth:
GMM-Free Flat Start Sequence-Discriminative DNN Training. INTERSPEECH 2016: 3409-3413
- [i1] Gábor Gosztolya, Tamás Grósz, László Tóth:
GMM-Free Flat Start Sequence-Discriminative DNN Training. CoRR abs/1610.03256 (2016)
- 2015
- [c9] Gábor Gosztolya, Tamás Grósz, László Tóth, David Imseng:
Building context-dependent DNN acoustic models using Kullback-Leibler divergence-based state tying. ICASSP 2015: 4570-4574
- [c8] Tamás Grósz, Róbert Busa-Fekete, Gábor Gosztolya, László Tóth:
Assessing the degree of nativeness and parkinson's condition using Gaussian processes and deep rectifier neural networks. INTERSPEECH 2015: 919-923
- 2014
- [c7] Péter Bodnár, Tamás Grósz, László Tóth, László G. Nyúl:
Localization of Visual Codes in the DCT Domain Using Deep Rectifier Neural Networks. ANNIIP 2014: 37-44
- [c6] Gábor Gosztolya, Tamás Grósz, Róbert Busa-Fekete, László Tóth:
Detecting the intensity of cognitive and physical load using AdaBoost and deep rectifier neural networks. INTERSPEECH 2014: 452-456
- [c5] Tamás Grósz, Péter Bodnár, László Tóth, László G. Nyúl:
QR code localization using deep neural networks. MLSP 2014: 1-6
- [c4] Tamás Grósz, Gábor Gosztolya, László Tóth:
A Sequence Training Method for Deep Rectifier Neural Networks in Speech Recognition. SPECOM 2014: 81-88
- [c3] György Kovács, László Tóth, Tamás Grósz:
Robust Multi-Band ASR Using Deep Neural Nets and Spectro-temporal Features. SPECOM 2014: 386-393
- [c2] Tamás Grósz, István Nagy T.:
Document Classification with Deep Rectifier Neural Networks and Probabilistic Sampling. TSD 2014: 108-115
- 2013
- [c1] László Tóth, Tamás Grósz:
A Comparison of Deep Neural Network Training Methods for Large Vocabulary Speech Recognition. TSD 2013: 36-43
last updated on 2024-12-02 22:26 CET by the dblp team
all metadata released as open data under CC0 1.0 license