DOI: 10.1145/3341105.3373990
research-article

A smart speaker performance measurement tool

Published: 30 March 2020

Abstract

Voice-controlled virtual assistants (VAs) in smart speakers and smartphones have recently become popular. Because a VA provides interactive services through a chain of complex processes, including speech recognition, natural language understanding, service invocation, and text-to-speech (TTS) generation, its functions are performed in the cloud. However, it is not well understood why the response time of voice commands is slow or where the performance bottleneck of the VA service lies. In this paper, we present a comprehensive VA performance measurement framework that analyzes timing events and response time by processing audio, video, and packets. In experiments with 414 voice commands on five smart speakers and 178 commands on two smartphone VAs, we observed that 24.9% of voice commands complete within two seconds and 63.2% within three seconds, while the remaining 36.8% take longer than three seconds and result in a poor user experience. In particular, 96.2% of music commands and 66.7% of IoT control commands have response times longer than three seconds. We found that our performance measurement tool is useful for identifying slow services such as music and news, which incur the overhead of extracting the user intent from the voice command, the content app startup delay, and the initial playback time. Our tool also shows that IoT control with a smart speaker yields slow response times.
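
The two-second and three-second thresholds reported above can be illustrated with a small summary routine. This is only a sketch under stated assumptions, not the paper's tool: the function name `summarize_response_times`, the inclusive treatment of the thresholds, and the sample values are all hypothetical, and the sample is not data from the paper.

```python
from typing import Dict, Sequence


def summarize_response_times(times_s: Sequence[float]) -> Dict[str, float]:
    """Return the percentage of commands in each response-time bucket.

    Assumption: "within N seconds" is treated as inclusive (t <= N).
    """
    if not times_s:
        raise ValueError("need at least one measurement")
    n = len(times_s)
    within_2 = sum(1 for t in times_s if t <= 2.0)
    within_3 = sum(1 for t in times_s if t <= 3.0)
    return {
        "within_2s": 100.0 * within_2 / n,
        "within_3s": 100.0 * within_3 / n,  # cumulative, includes within_2s
        "over_3s": 100.0 * (n - within_3) / n,
    }


# Hypothetical response times in seconds (NOT measurements from the paper).
sample = [1.4, 1.9, 2.3, 2.8, 3.1, 3.6, 4.2, 5.0]
summary = summarize_response_times(sample)
print(summary)
```

As in the abstract, the "within three seconds" share is cumulative, so it and the "over three seconds" share sum to 100%.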


Cited By

  • (2024) Scalable Acoustic IoT through Composable Distributed Beamforming Tags. 2024 23rd ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), pages 39-50. DOI: 10.1109/IPSN61024.2024.00008. Online publication date: 13-May-2024
  • (2022) What Could Possibly Go Wrong When Interacting with Proactive Smart Speakers? A Case Study Using an ESM Application. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, pages 1-15. DOI: 10.1145/3491102.3517432. Online publication date: 29-Apr-2022
  • (2022) Enabling progressive system integration for AIoT and speech-based HCI through semantic-aware computing. The Journal of Supercomputing, 78(3):3288-3324. DOI: 10.1007/s11227-021-03996-x. Online publication date: 1-Feb-2022

Published In

SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
March 2020
2348 pages
ISBN: 9781450368667
DOI: 10.1145/3341105
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. performance measurement
  2. smart speaker
  3. virtual assistant
  4. voice command

Qualifiers

  • Research-article

Funding Sources

  • This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2016R1D1A1A09916326)
  • This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC support program (IITP-2019-2016-0-00304) supervised by the IITP

Conference

SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing
March 30 - April 3, 2020
Brno, Czech Republic

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Article Metrics

  • Downloads (last 12 months): 48
  • Downloads (last 6 weeks): 9

Reflects downloads up to 13 Dec 2024
