Leveraging User Input and Feedback for Interactive Sound Event Detection and Annotation

Published: 05 March 2018
DOI: 10.1145/3172944.3173149

Abstract

Tagging of environmental audio events is essential in many areas. However, finding sound events and labeling them within a long audio file is tedious and time-consuming. Building an automatic recognition system with modern machine learning is often not feasible, because it requires a large number of human-labeled training examples and is not reliable enough for all uses. I propose interactive sound event detection to address this issue by combining machine search with human tagging, focusing specifically on the effectiveness of various types of user input for interactive sound search. The types of user input I will explore include binary relevance feedback, segmentation, and vocal imitation. I expect that leveraging one or a combination of these inputs will help users find audio content of interest quickly and accurately, even when there are not enough training examples for a typical automated system.
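To make the proposed loop concrete, here is a minimal sketch of one plausible interactive detection cycle built around binary relevance feedback: the system ranks unlabeled audio windows by similarity to the user's tagged examples, asks about the top candidate, and re-ranks after each answer. Everything here (rank_by_relevance, the random stand-in features) is a hypothetical illustration under simplifying assumptions, not the system described in the abstract.

```python
import numpy as np

def rank_by_relevance(features, labels):
    """Rank all windows by cosine similarity to the mean of the user's
    positive examples, penalized by similarity to negative examples.
    (A deliberately simple relevance-feedback heuristic, not the
    author's actual model.)"""
    pos = features[[i for i, y in labels.items() if y == 1]]
    neg = features[[i for i, y in labels.items() if y == 0]]

    def cos(a, b):
        return a @ b / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b) + 1e-9)

    scores = cos(features, pos.mean(axis=0))
    if len(neg) > 0:
        scores -= cos(features, neg.mean(axis=0))
    return np.argsort(-scores)  # most promising windows first

# Hypothetical setup: 500 one-second windows, each a 20-dim feature vector
# (stand-ins for real audio features such as MFCCs).
features = np.random.rand(500, 20)
labels = {42: 1}  # the user seeds the loop by tagging one occurrence

for _ in range(10):  # a few feedback rounds
    ranking = rank_by_relevance(features, labels)
    query = next(i for i in ranking if i not in labels)
    # In a real interface the user would audition the window; here we read 0/1.
    labels[query] = int(input(f"Is window {query} the target sound? (1/0) "))
```

In a fielded system, the binary answers would come from the user auditioning ranked candidates, segmentation input would refine window boundaries, and a vocal imitation could serve as the initial query in place of a tagged seed example.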

Cited By

  • SyncLabeling: A Synchronized Audio Segmentation Interface for Mobile Devices. Proceedings of the ACM on Human-Computer Interaction 7(MHCI), 1-19 (2023). DOI: 10.1145/3604273
  • Extracting Urban Sound Information for Residential Areas in Smart Cities Using an End-to-End IoT System. IEEE Internet of Things Journal 8(18), 14308-14321 (2021). DOI: 10.1109/JIOT.2021.3068755
  • Look, Listen, and Learn More: Design Choices for Deep Audio Embeddings. In Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3852-3856 (2019). DOI: 10.1109/ICASSP.2019.8682475


Published In

IUI '18: Proceedings of the 23rd International Conference on Intelligent User Interfaces
March 2018, 698 pages
ISBN: 9781450349451
DOI: 10.1145/3172944

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. human-in-the-loop system
  2. interactive machine learning
  3. sound event detection

Qualifiers

  • Abstract

Conference

IUI'18

Acceptance Rates

IUI '18 Paper Acceptance Rate: 43 of 299 submissions, 14%
Overall Acceptance Rate: 746 of 2,811 submissions, 27%

