research-article

Automatic Annotation of Voice Forum Content for Rural Users and Evaluation of Relevance

Authors:

Malolan Chetlur,

Pushpendra Singh

COMPASS '18: Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies

Article No.: 12, Pages 1 - 11

https://doi.org/10.1145/3209811.3209875

Published: 20 June 2018

Abstract

Voice forums are an effective intervention medium for marginalized communities to access information in a structured and localized manner. Users actively contribute by posting questions and responses in the form of audio messages, and thereby help in enriching the voice forum content. In order to build an audio library using the voice forums to disseminate information, significant manual effort is needed in analyzing and curating the data. This is one of the key impediments to the successful implementation of voice forums for knowledge dissemination and training.

In this paper, we explore the effectiveness of automated approaches to analyze and curate voice forum content in Hindi, a native language in the northern part of India. We study the use of standard techniques such as topic modeling and extractive summarization on Hindi speech transcripts (with a word error rate, WER, of 67%) to cluster audios thematically and to create summaries for individual audios, respectively. These curated audios are used to build an IVR-based library for community health workers in rural India. We evaluated the relevance of, and preference for, the automated annotations using a field trial. We find that the perception of relevance varied between human and automatically generated annotations, but automatically generated summaries were still found to be useful for accessing the voice forum audios.
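To put the reported WER figure in context, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer's hypothesis, divided by the number of reference words. The function name `word_error_rate` and the English toy sentences are illustrative assumptions; they are not taken from the paper, which worked with Hindi transcripts and does not describe its scoring code on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy example (hypothetical sentences): 4 word errors over 6 reference words.
wer = word_error_rate(
    "the child needs a vaccine today",
    "the child need vaccine to day",
)
print(f"{wer:.2f}")  # 0.67
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference. A value near 67% means roughly two of every three reference words are affected, which is why downstream topic modeling and summarization on such transcripts have to tolerate substantial noise.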

Index Terms

Automatic Annotation of Voice Forum Content for Rural Users and Evaluation of Relevance
1. Human-centered computing
  1. Accessibility
    1. Accessibility systems and tools
2. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results
      1. Relevance assessment
  2. Information systems applications
    1. Digital libraries and archives

Information & Contributors

Information

Published In

COMPASS '18: Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies

June 2018

472 pages

ISBN: 9781450358163

DOI: 10.1145/3209811

Conference Chair:
Ellen Zegura
Georgia Tech

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGCAS: ACM Special Interest Group on Computers and Society

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

COMPASS '18

Sponsor:

SIGCAS

COMPASS '18: ACM SIGCAS Conference on Computing and Sustainable Societies

June 20 - 22, 2018

Menlo Park and San Jose, CA, USA

Acceptance Rates

Overall Acceptance Rate 25 of 50 submissions, 50%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 7
Total Downloads: 175
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 0

Reflects downloads up to 21 Dec 2024

Citations

Cited By

Kaur J, Sharma P, Kumar V, Duggal M, Diamond-Smith N, El Ayadi A, Vosburg K, Singh P. (2024). Exploring the Role of Chatbots in Tackling COVID-19 Vaccine Hesitancy among Pregnant and Breastfeeding Women in Rural Northern India. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-29). Online publication date: 26-Apr-2024
https://dl.acm.org/doi/10.1145/3637332
Zhao X. (2023). Research on Methods and Applications Related to Question-and-Answer Dialogue Systems. Highlights in Science, Engineering and Technology 57 (9-14). Online publication date: 11-Jul-2023
https://doi.org/10.54097/hset.v57i.9885
Saha M, Bartindale T, Sultana S, Oliver G, Richardson D, Thilsted S, Ahmed S, Olivier P. (2022). Commissioning Development: Grantmaking, Community Voices, and their Implications for ICTD. Proceedings of the 2022 International Conference on Information and Communication Technologies and Development (1-18). Online publication date: 27-Jun-2022
https://dl.acm.org/doi/10.1145/3572334.3572402
Khullar A, Panjal P, Pandey R, Burnwal A, Raj P, Jha A, Hitesh P, Reddy R, Himanshu H, Seth A. (2021). Experiences with the Introduction of AI-based Tools for Moderation Automation of Voice-based Participatory Media Forum. Proceedings of the 12th Indian Conference on Human-Computer Interaction (30-39). Online publication date: 19-Nov-2021
https://dl.acm.org/doi/10.1145/3506469.3506473
Khullar A, Santosh M, Kumar P, Rahman S, Tripathi R, Kumar D, Saini S, Pandey R, Seth A. (2021). Early Results from Automating Voice-based Question-Answering Services Among Low-income Populations in India. Proceedings of the 4th ACM SIGCAS Conference on Computing and Sustainable Societies (79-87). Online publication date: 28-Jun-2021
https://dl.acm.org/doi/10.1145/3460112.3471946
Yadav D, Malik P, Dabas K, Singh P, Kitamura Y, Quigley A, Isbister K, Igarashi T, Bjørn P, Drucker S. (2021). Illustrating the Gaps and Needs in the Training Support of Community Health Workers in India. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (1-16). Online publication date: 6-May-2021
https://dl.acm.org/doi/10.1145/3411764.3445111
Pandey A, Mutreja I, Brar S, Singh P. (2020). Exploring Automated Q&A Support System for Maternal and Child Health in Rural India. Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies (349-350). Online publication date: 15-Jun-2020
https://dl.acm.org/doi/10.1145/3378393.3402281
