- Research article, November 2024
Accurate synthesis of dysarthric speech for ASR data augmentation
Highlights:
- Modified a neural multi-talker TTS by adding a dysarthria severity level coefficient and a pause insertion model to synthesize dysarthric speech at varying severity levels (a minimal conditioning sketch follows this entry).
- Providing data augmentation for machine learning tasks such ...
Dysarthria is a motor speech disorder often characterized by reduced speech intelligibility resulting from slow, uncoordinated control of the speech production muscles. Automatic Speech Recognition (ASR) systems can help dysarthric talkers communicate more ...
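As a rough, hypothetical illustration of the conditioning idea in the first highlight (not the authors' architecture; the class name, dimensions, and the 0 to 1 severity range are assumptions), a scalar severity coefficient can be broadcast over time and fed to a neural TTS decoder:

```python
# Hypothetical sketch of severity conditioning for a neural TTS decoder.
# Not the paper's model; names, sizes, and the 0-1 severity scale are assumed.
import torch
import torch.nn as nn

class SeverityConditionedDecoder(nn.Module):
    def __init__(self, text_dim=256, hidden_dim=512, n_mels=80):
        super().__init__()
        # Embed the scalar severity level and concatenate it with the text encoding.
        self.severity_proj = nn.Linear(1, 32)
        self.rnn = nn.GRU(text_dim + 32, hidden_dim, batch_first=True)
        self.mel_out = nn.Linear(hidden_dim, n_mels)

    def forward(self, text_encodings, severity):
        # text_encodings: (batch, time, text_dim); severity: (batch,) in [0, 1]
        sev = self.severity_proj(severity.unsqueeze(-1))             # (batch, 32)
        sev = sev.unsqueeze(1).expand(-1, text_encodings.size(1), -1)
        hidden, _ = self.rnn(torch.cat([text_encodings, sev], dim=-1))
        return self.mel_out(hidden)                                  # mel-spectrogram frames

# Decode the same (stand-in) encoder output at two severity settings.
decoder = SeverityConditionedDecoder()
enc = torch.randn(1, 120, 256)
mild, severe = decoder(enc, torch.tensor([0.2])), decoder(enc, torch.tensor([0.8]))
```

Decoding the same encoder output at different severity settings is what makes this style of synthesis useful for augmenting ASR training data across severity levels.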
- Research article, October 2024
The effects of informational and energetic/modulation masking on the efficiency and ease of speech communication across the lifespan
Highlights:
- In more naturalistic everyday settings, communication efficiency gradually improves from childhood to adulthood irrespective of the listening condition (easy vs. challenging).
- Even moderate levels of background speech affect ...
Children and older adults have greater difficulty understanding speech when there are other voices in the background (informational masking, IM) than when the interference is a steady-state noise with a similar spectral profile but is not speech (...
- Research article, May 2024
The impact of non-native English speakers’ phonological and prosodic features on automatic speech recognition accuracy
Highlights:
- Higher speech intensity and lower speech rates improve automatic speech recognition accuracy.
- Arab ESL teachers and students give more attention to pronunciation errors that do not affect intelligibility.
- Arabic-influenced ESL ...
The present study examines the impact of Arab speakers’ phonological and prosodic features on the accuracy of automatic speech recognition (ASR) of non-native English speech. The authors first investigated the perceptions of 30 Egyptian ESL ...
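ASR accuracy in studies of this kind is conventionally reported as word error rate (WER). The function below is a generic, self-contained WER computation (word-level Levenshtein distance), not code from the paper:

```python
# Generic word error rate (WER): word-level edit distance divided by reference length.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat on the mat", "the cat sit on mat"))  # 2 edits / 6 words = 0.33...
```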
- Research article, February 2024
The Role of Auditory and Visual Cues in the Perception of Mandarin Emotional Speech in Male Drug Addicts
Highlights:
- Fills the research gap in the field of speech perception among drug addicts.
- Reveals the presence of a disorder or deficit in multi-modal emotional speech processing in drug addicts.
- Suggests that visual cues, such as facial ...
Evidence from previous neurological studies has revealed that drugs can cause severe damage to the human brain structure, leading to significant cognitive disorders in emotion processing, such as psychotic-like symptoms (e.g., speech illusion: ...
- Research article, October 2023
Acoustic properties of non-native clear speech: Korean speakers of English
Highlights:
- Non-native clear speech is acoustically distinct from casual speech.
- The nature of modifications is the same in native and non-native clear speech.
- The magnitude of modifications is different in native and non-native clear speech. ...
The present study examined the acoustic properties of clear speech produced by non-native speakers of English (L1 Korean), in comparison to native clear speech. L1 Korean speakers of English (N=30) and native speakers of English (N=20) read an ...
- Review article, October 2023
Speech emotion recognition approaches: A systematic review
Abstract: The speech emotion recognition (SER) field has been active since it became a crucial feature in advanced Human–Computer Interaction (HCI), and it is used in a wide range of real-life applications. In recent years, numerous SER systems have been covered by ...
Highlights:
- The speech emotion recognition (SER) field became crucial in advanced Human–Computer Interaction (HCI).
- Numerous SER systems have been proposed by researchers using Machine Learning (ML) and Deep Learning (DL).
- This survey aims to ...
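For orientation, the systems such a review surveys are usually framed as feature extraction plus a classifier. The sketch below is a deliberately minimal, generic SER-style pipeline (mean MFCCs with an SVM) and is not any system from the review; the synthetic tones merely stand in for labelled emotional speech:

```python
# Minimal, generic SER-style pipeline: mean MFCC features + SVM classifier.
# Illustrative only; synthetic tones replace real labelled emotional speech.
import numpy as np
import librosa
from sklearn.svm import SVC

def mfcc_features(y, sr=16000, n_mfcc=13):
    # Frame-level MFCCs averaged over time -> one fixed-length vector per clip.
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

sr = 16000
t = np.linspace(0, 1.0, sr, endpoint=False)
clips = [np.sin(2 * np.pi * f * t) for f in (120, 130, 140, 220, 230, 240)]
labels = [0, 0, 0, 1, 1, 1]  # e.g. 0 = "neutral", 1 = "angry" (placeholder labels)

X = np.stack([mfcc_features(c.astype(np.float32), sr) for c in clips])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:1]))    # predicted label for the first clip
```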
- Research article, February 2023
Shared and task-specific phase coding characteristics of gamma- and theta-bands in speech perception and covert speech
Speech Communication (SPCO), Volume 147, Issue C, Pages 63–73. https://doi.org/10.1016/j.specom.2023.01.007
Abstract: Covert speech is the mental imagery of speaking. The task has gained increasing attention as a way to understand the nature of thought and to produce decoding methods for brain–computer interfaces. Building on previous work, we sought to ...
Highlights:
- Understanding speech-related temporal encoding is useful for brain–computer interface training.
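Phase coding analyses of this kind typically start from band-limited instantaneous phase. The sketch below band-pass filters a signal into theta and gamma ranges and takes the Hilbert-transform phase; it is a generic illustration (the sampling rate, band edges, and the random stand-in signal are assumptions), not the authors' analysis code:

```python
# Generic band-limited phase extraction via band-pass filtering + Hilbert transform.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_phase(signal, fs, low, high, order=4):
    # Zero-phase band-pass filter, then instantaneous phase of the analytic signal.
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, signal)))

fs = 500                                       # assumed EEG-like sampling rate
t = np.arange(0, 2.0, 1 / fs)
channel = np.random.randn(t.size)              # stand-in for one recorded channel
theta_phase = band_phase(channel, fs, 4, 8)    # theta band (4-8 Hz)
gamma_phase = band_phase(channel, fs, 30, 80)  # gamma band (30-80 Hz)
```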
- Research article, February 2023
Acoustic characterization and machine prediction of perceived masculinity and femininity in adults
Speech Communication (SPCO), Volume 147, Issue C, Pages 22–40. https://doi.org/10.1016/j.specom.2023.01.002
Abstract: Previous research has found that the human voice can provide reliable information for gender identification with a high level of accuracy. In social psychology, perceived masculinity and femininity (masculinity and femininity ...
Highlights:
- We modelled femininity/masculinity ratings for 129 female/96 male voices.
- ...
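Purely to illustrate the modelling setup the abstract and highlight point to (regressing listener ratings onto acoustic measures), here is a hypothetical sketch; the feature set, synthetic data, and ridge regression are assumptions, not the paper's method:

```python
# Hypothetical sketch: predict perceived femininity/masculinity ratings from
# acoustic features with cross-validated ridge regression. Data are synthetic.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_voices = 225                     # e.g. 129 + 96 voices, as in the highlight
# Assumed feature columns: mean F0 (Hz), formant dispersion (Hz), HNR (dB).
X = rng.normal(size=(n_voices, 3))
ratings = X @ np.array([0.6, -0.3, 0.1]) + rng.normal(scale=0.5, size=n_voices)

scores = cross_val_score(Ridge(alpha=1.0), X, ratings, cv=5, scoring="r2")
print("cross-validated R^2:", scores.mean())
```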
- Research article, January 2023
Vocal characteristics of accuracy in eyewitness testimony
Speech Communication (SPCO), Volume 146, Issue C, Pages 82–92. https://doi.org/10.1016/j.specom.2022.12.001
Highlights:
- We demonstrate that the accuracy of statements in an eyewitness testimony can be communicated with auditory cues.
In two studies, we examined if correct and incorrect testimony statements were produced with vocally distinct characteristics. Participants watched a staged crime film and were interviewed as eyewitnesses. Witness responses were ...
- Research article, January 2023
Effects of hearing loss and audio-visual cues on children's speech processing speed
Speech Communication (SPCO), Volume 146, Issue C, Pages 11–21. https://doi.org/10.1016/j.specom.2022.11.003
Highlights:
- Children with hearing loss process speech faster with visual cues.
- Audio-visual benefits are similar to those of children with normal hearing.
- Overall processing speed remains slower than that of children with normal hearing.
- ...
Children with hearing loss (HL) can generally achieve functional speech perception with the assistance of hearing aids and/or cochlear implants. However, their speech processing may be less efficient than that of their peers with normal hearing (...
- Research article, January 2023
The effect of fluency strategy training on interpreter trainees’ speech fluency: Does content familiarity matter?
Speech Communication (SPCO), Volume 146, Issue C, Pages 1–10. https://doi.org/10.1016/j.specom.2022.11.002
Highlights:
- Fluency training significantly enhances the interpreter trainees’ speech fluency.
The present study examines the effect of fluency strategy training on the speech fluency of interpreter trainees using a pretest-posttest-delayed posttest design. Moreover, it investigates whether content familiarity influences the ...
- Research article, October 2022
Prosodic development from 4 to 10 years: Data from the Italian adaptation of the PEPS-C
Speech Communication (SPCO), Volume 144, Issue C, Pages 10–19. https://doi.org/10.1016/j.specom.2022.08.007
Highlights:
- The development of prosodic functions covers a long period, extending until adolescence.
- ...
The development of prosody covers a long age period, with some functions not well mastered until adolescence. Moreover, languages show different prosodic developmental trajectories, which can be related to their distinctive ...
- Research article, October 2022
The role of visual cues indicating onset times of target speech syllables in release from informational or energetic masking
Speech Communication (SPCO), Volume 144, Issue C, Pages 20–25. https://doi.org/10.1016/j.specom.2022.08.003
Highlights:
- Listeners benefit from visually guided cues for the timing of syllables in noise.
This study examined the effect of visual cues that provide the timing information of syllables in nonsense target sentences on the recognition of target speech against either a speech-spectrum noise masker or a two-talker masker. When ...
- Research article, October 2022
Leveraging audible and inaudible signals for pronunciation training by sensing articulation through a smartphone
Speech Communication (SPCO), Volume 144, Issue C, Pages 42–56. https://doi.org/10.1016/j.specom.2022.08.002
Highlights:
- Grounded in pronunciation principles, this paper presents a smartphone-based pronunciation training system for practicing monophthongs with respect to vowel articulation.
Learning foreign-language pronunciation is among the most challenging tasks for non-native speakers. Improving pronunciation based on feedback from pronunciation error scores is also not easy for learners. Our goal is to develop an ...
- Research article, October 2022
Arm motion symmetry in conversation
Speech Communication (SPCO), Volume 144, Issue C, Pages 75–88. https://doi.org/10.1016/j.specom.2022.08.001
Abstract: Data-driven synthesis of human motion during conversational speech is an active research area with applications that include character animation, computer gaming and conversational agents. Natural-looking motion is key to both ...
Highlights:
- Reviews the motion symmetry of multiple speakers during dyadic conversation.
- ...
- Research article, September 2022
A bimodal network based on Audio–Text-Interactional-Attention with ArcFace loss for speech emotion recognition
Speech Communication (SPCO), Volume 143, Issue C, Pages 21–32. https://doi.org/10.1016/j.specom.2022.07.004
Abstract: Speech emotion recognition (SER) is an essential part of human–computer interaction, and in recent years SER has made wide use of multimodal information. This paper focuses on exploiting the acoustic and textual ...
Highlights:
- A bimodal network based on an Audio–Text-Interactional-Attention (ATIA) structure and ArcFace loss is proposed.
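ArcFace, named in the highlight above, is an additive angular margin softmax loss originally proposed for face recognition. The following is a minimal, generic PyTorch rendering of that loss applied to emotion classes; the scale, margin, embedding size, and class count are illustrative assumptions rather than the paper's configuration:

```python
# Minimal ArcFace (additive angular margin) loss in PyTorch. Hyperparameters
# and the 4-class emotion setup are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceLoss(nn.Module):
    def __init__(self, embed_dim, n_classes, scale=30.0, margin=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, embed_dim))
        self.scale, self.margin = scale, margin

    def forward(self, embeddings, labels):
        # Cosine similarity between L2-normalised embeddings and class weights.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        # Add the angular margin only to each sample's target-class angle.
        one_hot = F.one_hot(labels, num_classes=self.weight.size(0)).float()
        logits = self.scale * torch.cos(theta + self.margin * one_hot)
        return F.cross_entropy(logits, labels)

# Dummy usage: 8 bimodal embeddings (128-dim), 4 emotion classes.
loss_fn = ArcFaceLoss(embed_dim=128, n_classes=4)
print(loss_fn(torch.randn(8, 128), torch.randint(0, 4, (8,))))
```

Compared with plain softmax cross-entropy, the angular margin pushes same-class embeddings closer together on the hypersphere, which is why the loss is popular for discriminative embedding learning.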
- Research article, September 2022
A method for improving bot effectiveness by recognising implicit customer intent in contact centre conversations
Speech Communication (SPCO), Volume 143, Issue C, Pages 33–45. https://doi.org/10.1016/j.specom.2022.07.003
Highlights:
- A new method, designed specifically for this industry, for recognising the intent of a customer contacting a contact centre (CC) hotline.
Contact centre systems are increasingly using intelligent voicebots and chatbots. These solutions are constantly evolving and improving. One of the main tasks of a virtual assistant is to recognise customers’ ...
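Intent recognition of the kind described above is often prototyped as supervised text classification over transcribed utterances. The baseline below (TF-IDF features with logistic regression over invented example utterances) is only a generic sketch and does not reproduce the paper's method for implicit intent:

```python
# Generic intent-classification baseline: TF-IDF + logistic regression.
# Utterances and intent labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

utterances = [
    "I want to check my account balance",
    "how much money do I have",
    "I'd like to cancel my subscription",
    "please stop my service",
]
intents = ["balance", "balance", "cancel", "cancel"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(utterances, intents)
print(clf.predict(["can you tell me my balance"]))  # expected: ['balance']
```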
- Research article, June 2022
Learning transfer from singing to speech: Insights from vowel analyses in aging amateur singers and non-singers
Speech Communication (SPCO), Volume 141, Issue C, Pages 28–39. https://doi.org/10.1016/j.specom.2022.05.001
Highlights:
- Articulatory space and vowel distinctiveness are independent vowel properties.
- ...
Task-independent (e.g., Ballard et al., 2003) and task-dependent models (e.g., Ziegler, 2003) differ in their predictions regarding the learning transfer from non-speech activities to speech. We argue that singing is ...
- Research article, June 2022
Perceptual effects of interpolated Austrian and German standard varieties
Speech Communication (SPCO), Volume 141, Issue C, Pages 107–120. https://doi.org/10.1016/j.specom.2022.04.003
Highlights:
- Pluricentric research on German by means of a listener judgment experiment.
- ...
This article focuses on the perception of standard varieties produced by Austrian and German TV newscasters from the perspective of listeners from both countries, Germany and Austria. Thus, the paper's sociolinguistic scope is located ...
- Research article, June 2022
Seeing lexical tone: Head and face motion in production and perception of Cantonese lexical tones
- Denis Burnham,
- Eric Vatikiotis-Bateson,
- Adriano Vilela Barbosa,
- João Vítor Menezes,
- Hani C. Yehia,
- Rua Haszard Morris,
- Guillaume Vignali,
- Jessica Reynolds
Speech Communication (SPCO), Volume 141, Issue C, Pages 40–55. https://doi.org/10.1016/j.specom.2022.03.011
Highlights:
- Visual information for lexical tone is more resilient in running speech than auditory information for tone.
Previous studies show that lexical tones can be discriminated visually, but the locus of this information is unknown. Here we investigate the role of visual face and head information in the production and perception of the six ...