DOI: 10.1145/3171221.3171280
research-article
Open access

DNN-HMM based Automatic Speech Recognition for HRI Scenarios

Published: 26 February 2018

Abstract

In this paper, we propose replacing the classical black-box integration of automatic speech recognition (ASR) technology in HRI applications with an integration that incorporates a representation and model of the HRI environment, together with the states and contexts of the robot and the user. Accordingly, this paper focuses on environment representation and modeling: a deep neural network-hidden Markov model (DNN-HMM) based ASR engine is trained on clean utterances combined with the acoustic-channel responses and noise recorded in an HRI testbed built with a PR2 mobile manipulation robot. This method avoids having to record a training database in every possible acoustic environment of a given HRI scenario. Moreover, different recognition testing conditions were produced by recording two types of acoustic sources, i.e. a loudspeaker and human speakers, with a Microsoft Kinect mounted on top of the PR2 robot while it performed head rotations and movements towards and away from the fixed sources. In this generic HRI scenario, and with a limited amount of training data, the resulting ASR engine achieved a word error rate at least 26% and 38% lower than publicly available speech recognition APIs on the playback (i.e. loudspeaker) and human testing databases, respectively.
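The training strategy described above can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: it assumes single-channel float arrays at a common sample rate, and the function names are ours. Each clean utterance is convolved with a measured acoustic-channel impulse response and mixed with recorded environment noise at a chosen SNR, which is how one corpus of clean speech can be expanded to many acoustic conditions without re-recording.

# Minimal sketch (assumptions as stated above, not the paper's actual code).
import numpy as np
from scipy.signal import fftconvolve

def add_noise_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then mix."""
    noise = np.resize(noise, speech.shape)   # loop or trim noise to utterance length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12    # guard against silent noise files
    scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise

def simulate_training_utterance(clean, rir, noise, snr_db):
    """Pass clean speech through a measured channel (RIR), then add noise."""
    reverberant = fftconvolve(clean, rir)[: len(clean)]   # keep original length
    return add_noise_at_snr(reverberant, noise, snr_db)

Sweeping `rir`, `noise`, and `snr_db` over the channel responses and noises captured in the testbed yields the multi-condition training set on which the DNN-HMM acoustic model is then trained.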





Published In

HRI '18: Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction
February 2018
468 pages
ISBN: 9781450349536
DOI: 10.1145/3171221
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 February 2018


Author Tags

  1. dnn-hmm
  2. speech recognition
  3. time-varying acoustic channel

Qualifiers

  • Research-article

Funding Sources

  • ONRG
  • Conicyt-PCHA/Doctorado
  • Conicyt-Fondecyt

Conference

HRI '18

Acceptance Rates

HRI '18 Paper Acceptance Rate: 49 of 206 submissions, 24%.
Overall Acceptance Rate: 268 of 1,124 submissions, 24%.


Article Metrics

  • Downloads (last 12 months): 206
  • Downloads (last 6 weeks): 24
Reflects downloads up to 09 Jan 2025.


Cited By

  • (2025) Speech emotion recognition in real static and dynamic human-robot interaction scenarios. Computer Speech & Language, 89:101666. DOI: 10.1016/j.csl.2024.101666. Online publication date: Jan-2025.
  • (2024) Predicting transformer temperature field based on physics-informed neural networks. High Voltage. DOI: 10.1049/hve2.12435. Online publication date: 9-May-2024.
  • (2024) Exploring emergent syllables in end-to-end automatic speech recognizers through model explainability technique. Neural Computing and Applications, 36(12):6875-6901. DOI: 10.1007/s00521-024-09435-1. Online publication date: 13-Feb-2024.
  • (2023) Automatic Detection of Dyspnea in Real Human-Robot Interaction Scenarios. Sensors, 23(17):7590. DOI: 10.3390/s23177590. Online publication date: 1-Sep-2023.
  • (2023) Voice Interaction Recognition Design in Real-Life Scenario Mobile Robot Applications. Applied Sciences, 13(5):3359. DOI: 10.3390/app13053359. Online publication date: 6-Mar-2023.
  • (2023) RoboClean: Contextual Language Grounding for Human-Robot Interactions in Specialised Low-Resource Environments. Proceedings of the 5th International Conference on Conversational User Interfaces, 1-11. DOI: 10.1145/3571884.3597137. Online publication date: 19-Jul-2023.
  • (2023) Multi-Feature and Multi-Modal Mispronunciation Detection and Diagnosis Method Based on the Squeezeformer Encoder. IEEE Access, 11:66245-66256. DOI: 10.1109/ACCESS.2023.3278837. Online publication date: 2023.
  • (2023) Gestural and Touchscreen Interaction for Human-Robot Collaboration: A Comparative Study. Intelligent Autonomous Systems 17, 122-138. DOI: 10.1007/978-3-031-22216-0_9. Online publication date: 18-Jan-2023.
  • (2022) Hidden-state modeling of a cross-section of geoelectric time series data can provide reliable intermediate-term probabilistic earthquake forecasting in Taiwan. Natural Hazards and Earth System Sciences, 22(6):1931-1954. DOI: 10.5194/nhess-22-1931-2022. Online publication date: 9-Jun-2022.
  • (2022) Learning relationships between audio signals based on reservoir networks. 2022 International Joint Conference on Neural Networks (IJCNN), 1-6. DOI: 10.1109/IJCNN55064.2022.9892009. Online publication date: 18-Jul-2022.
