DOI: 10.1145/3209811.3209875
research-article

Automatic Annotation of Voice Forum Content for Rural Users and Evaluation of Relevance

Published: 20 June 2018

Abstract

Voice forums are an effective intervention medium through which marginalized communities can access information in a structured and localized manner. Users actively contribute by posting questions and responses as audio messages, thereby enriching the voice forum content. Building an audio library from this content for information dissemination, however, requires significant manual effort to analyze and curate the data. This effort is one of the key impediments to the successful use of voice forums for knowledge dissemination and training.
In this paper, we explore the effectiveness of automated approaches to analyzing and curating voice forum content in Hindi, a language native to northern India. We apply two standard techniques to Hindi speech transcripts with a word error rate (WER) of 67%: topic modeling, to cluster audios thematically, and extractive summarization, to create a summary of each individual audio. The curated audios are used to build an IVR-based library for community health workers in rural India. We evaluated the relevance of, and preference for, the automated annotations in a field trial. We find that perceived relevance differed between human-generated and automatically generated annotations, but that the automatically generated summaries were still useful for accessing the voice forum audios.
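For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for illustration; the abstract does not say how the authors computed their figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a three-word reference in which two words are misrecognized scores 2/3, so a WER of 67% means roughly two word-level errors for every three reference words.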

Published In

COMPASS '18: Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies
June 2018
472 pages
ISBN:9781450358163
DOI:10.1145/3209811
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Community Health Workers
  2. HCI4D
  3. ICT4D
  4. IVR
  5. Interactive Voice Response
  6. Speech Summarization
  7. Topic Modeling

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

COMPASS '18: ACM SIGCAS Conference on Computing and Sustainable Societies
June 20 - 22, 2018
Menlo Park and San Jose, CA, USA

Acceptance Rates

Overall Acceptance Rate 25 of 50 submissions, 50%

Cited By

  • (2024) Exploring the Role of Chatbots in Tackling COVID-19 Vaccine Hesitancy among Pregnant and Breastfeeding Women in Rural Northern India. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-29). DOI: 10.1145/3637332. Online publication date: 26-Apr-2024
  • (2023) Research on Methods and Applications Related to Question-and-Answer Dialogue Systems. Highlights in Science, Engineering and Technology 57 (9-14). DOI: 10.54097/hset.v57i.9885. Online publication date: 11-Jul-2023
  • (2022) Commissioning Development: Grantmaking, Community Voices, and their Implications for ICTD. Proceedings of the 2022 International Conference on Information and Communication Technologies and Development (1-18). DOI: 10.1145/3572334.3572402. Online publication date: 27-Jun-2022
  • (2021) Experiences with the Introduction of AI-based Tools for Moderation Automation of Voice-based Participatory Media Forum. Proceedings of the 12th Indian Conference on Human-Computer Interaction (30-39). DOI: 10.1145/3506469.3506473. Online publication date: 19-Nov-2021
  • (2021) Early Results from Automating Voice-based Question-Answering Services Among Low-income Populations in India. Proceedings of the 4th ACM SIGCAS Conference on Computing and Sustainable Societies (79-87). DOI: 10.1145/3460112.3471946. Online publication date: 28-Jun-2021
  • (2021) Illustrating the Gaps and Needs in the Training Support of Community Health Workers in India. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (1-16). DOI: 10.1145/3411764.3445111. Online publication date: 6-May-2021
  • (2020) Exploring Automated Q&A Support System for Maternal and Child Health in Rural India. Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies (349-350). DOI: 10.1145/3378393.3402281. Online publication date: 15-Jun-2020
