DOI: 10.1145/3441852.3471201
research-article

The Efficacy of Collaborative Authoring of Video Scene Descriptions

Published: 17 October 2021

Abstract

The majority of online video content remains inaccessible to people with visual impairments because it lacks audio descriptions that depict the video scenes. Content creators have traditionally relied on professionals to author audio descriptions, but professional services are costly and not readily available. We investigate the feasibility of creating more cost-effective audio descriptions that are also of high quality by involving novices. Specifically, we designed, developed, and evaluated ViScene, a web-based collaborative audio description authoring tool that enables a sighted novice author and a reviewer, either sighted or blind, to interact and contribute to scene descriptions (SDs): text that can be transformed into audio through text-to-speech. Through a mixed-design study with N = 60 participants, we assessed the quality of SDs created by sighted novices with feedback from both sighted and blind reviewers. Our results showed that with ViScene, novices could produce content that is Descriptive, Objective, Referable, and Clear at a cost of US$2.81 to US$5.48 per video minute (pvm), which is 54% to 96% lower than professional services. However, the descriptions fell short on other quality dimensions (e.g., learning, a measure of how well an SD conveys the video’s intended message). While professional audio describers remain the gold standard, ViScene offers a cost-effective alternative for content creators who cannot afford them, ultimately leading to a more accessible medium.

Supplementary Material

VTT File (6079.vtt)
Supplemental materials (6079-file2.zip)
MP4 File (6079.mp4)
Presentation video



Information & Contributors

Information

Published In

ASSETS '21: Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility
October 2021
730 pages
ISBN:9781450383066
DOI:10.1145/3441852
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Scene description
  2. video accessibility
  3. visual impairment

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASSETS '21

Acceptance Rates

ASSETS '21 Paper Acceptance Rate 36 of 134 submissions, 27%;
Overall Acceptance Rate 436 of 1,556 submissions, 28%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 99
  • Downloads (last 6 weeks): 11
Reflects downloads up to 12 Dec 2024

Cited By

  • (2024) Design considerations for photosensitivity warnings in visual media. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-12. DOI: 10.1145/3663548.3675643. Online publication date: 27-Oct-2024
  • (2024) Towards Accessible Musical Performances in Virtual Reality: Designing a Conceptual Framework for Omnidirectional Audio Descriptions. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-17. DOI: 10.1145/3663548.3675618. Online publication date: 27-Oct-2024
  • (2024) Audio Description Customization. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-19. DOI: 10.1145/3663548.3675617. Online publication date: 27-Oct-2024
  • (2024) WorldScribe: Towards Context-Aware Live Visual Descriptions. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, pp. 1-18. DOI: 10.1145/3654777.3676375. Online publication date: 13-Oct-2024
  • (2024) Making Short-Form Videos Accessible with Hierarchical Video Summaries. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-17. DOI: 10.1145/3613904.3642839. Online publication date: 11-May-2024
  • (2024) “It’s Kind of Context Dependent”: Understanding Blind and Low Vision People’s Video Accessibility Preferences Across Viewing Scenarios. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-20. DOI: 10.1145/3613904.3642238. Online publication date: 11-May-2024
  • (2024) A Systematic Review of Ability-diverse Collaboration through Ability-based Lens in HCI. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-21. DOI: 10.1145/3613904.3641930. Online publication date: 11-May-2024
  • (2023) Understanding Strategies and Challenges of Conducting Daily Data Analysis (DDA) Among Blind and Low-vision People. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-15. DOI: 10.1145/3597638.3608423. Online publication date: 22-Oct-2023
  • (2023) Understanding Challenges and Opportunities in Body Movement Education of People who are Blind or have Low Vision. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-19. DOI: 10.1145/3597638.3608409. Online publication date: 22-Oct-2023
  • (2023) Beyond Audio Description: Exploring 360° Video Accessibility with Blind and Low Vision Users Through Collaborative Creation. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-17. DOI: 10.1145/3597638.3608381. Online publication date: 22-Oct-2023
