Computer Science > Software Engineering

arXiv:2302.00330 (cs)

[Submitted on 1 Feb 2023]

Title: Prioritizing Speech Test Cases

Authors: Zhou Yang, Jieke Shi, Muhammad Hilmi Asyrofi, Bowen Xu, Xin Zhou, DongGyun Han, David Lo

Abstract: With the wide adoption of automated speech recognition (ASR) systems, it is increasingly important to test and improve them. However, collecting and executing speech test cases is usually expensive and time-consuming, which motivates us to prioritize speech test cases strategically. A key question is how to determine the ideal order for collecting and executing speech test cases so as to uncover more errors as early as possible. Each speech test case consists of a piece of audio and the corresponding reference text. In this work, we propose PROPHET (PRiOritizing sPeecH tEsT), a tool that predicts potentially error-uncovering speech test cases based only on their reference texts. PROPHET therefore analyzes and prioritizes test cases without running the ASR system, which allows it to analyze speech test cases at a large scale. We evaluate 6 different prioritization methods on 3 ASR systems and 12 datasets. Given the same testing budget, we find that our approach uncovers 12.63% more wrongly recognized words than the state-of-the-art method. We also select test cases from the prioritized list to fine-tune ASR systems and analyze how our approach can improve ASR system performance. Statistical tests show that our proposed method brings significantly larger performance improvements to ASR systems than the existing baseline methods. Furthermore, we perform a correlation analysis and confirm that fine-tuning an ASR system on a dataset on which the model performs worse tends to improve performance more.

Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2302.00330 [cs.SE]
	(or arXiv:2302.00330v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2302.00330

Submission history

From: Zhou Yang
[v1] Wed, 1 Feb 2023 09:20:08 UTC (3,509 KB)
