research-article

Crowdsourced Monolingual Translation

Authors:

Benjamin B. Bederson

ACM Transactions on Computer-Human Interaction (TOCHI), Volume 21, Issue 4

Article No.: 22, Pages 1 - 35

https://doi.org/10.1145/2627751

Published: 01 August 2014

Abstract

An enormous potential exists for solving certain classes of computational problems through rich collaboration among crowds of humans supported by computers. Solutions to these problems have traditionally required human professionals, who are expensive to hire or difficult to find. Fully automatic systems, despite significant advances, still have much room for improvement. Recent research has involved recruiting large crowds of skilled humans (“crowdsourcing”), but crowdsourcing solutions are still restricted by the availability of those skilled participants. Translation illustrates all three limits: professional translators are costly and not always available; machine translation systems have improved greatly in recent years but still provide only passable translations; and crowdsourced translation is limited by the availability of bilingual people.

This article describes crowdsourced monolingual translation, where monolingual translation is translation performed by monolingual people. Crowdsourced monolingual translation is a collaborative form of translation performed by two crowds of people who speak the source or the target language, respectively, with machine translation as the mediating device.

This article describes a general protocol to handle crowdsourced monolingual translation and analyzes three systems that implemented the protocol. These systems were studied in various settings and were found to supply significant improvement in quality over both machine translation and monolingual editing of machine translation output (“postediting”).
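The abstract names the ingredients of the protocol (a source-language crowd, a target-language crowd, and machine translation between them) but does not list its steps. The following Python sketch shows one plausible shape such a loop could take, under stated assumptions: every name in it (`mediate`, `Result`, the callback parameters, `max_rounds`) is hypothetical and not taken from the article, and the crowds and the machine translation engine are plain callables supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical type aliases; none of these names come from the article.
Translate = Callable[[str], str]    # machine translation in one direction
Edit = Callable[[str], str]         # a monolingual person rewriting text
Judge = Callable[[str, str], bool]  # source speaker compares original and back-translation


@dataclass
class Result:
    translation: str   # last candidate produced by the target-language side
    accepted: bool     # True if a source speaker accepted a back-translation
    rounds: int        # number of rounds actually run
    history: List[str] = field(default_factory=list)  # every candidate, in order


def mediate(
    source: str,
    mt_forward: Translate,
    mt_backward: Translate,
    target_editor: Edit,
    source_judge: Judge,
    source_rephraser: Edit,
    max_rounds: int = 3,
) -> Result:
    """Run an assumed monolingual translation loop for one sentence."""
    current_source = source
    candidate = ""
    history: List[str] = []
    for round_no in range(1, max_rounds + 1):
        # 1. Machine translation produces a rough draft in the target language.
        draft = mt_forward(current_source)
        # 2. A target-language speaker edits the draft without seeing the source.
        candidate = target_editor(draft)
        history.append(candidate)
        # 3. The edit is machine-translated back for the source-language side.
        back = mt_backward(candidate)
        # 4. A source-language speaker checks the meaning against the original.
        if source_judge(source, back):
            return Result(candidate, True, round_no, history)
        # 5. Otherwise the source side rephrases and the loop repeats.
        current_source = source_rephraser(current_source)
    return Result(candidate, False, max_rounds, history)
```

In this sketch neither crowd ever reads the other language: the target-language editor sees only machine-translated drafts, and the source-language judge sees only back-translations. The `max_rounds` cap is an illustrative choice that keeps the loop from running indefinitely when the two sides never converge.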

Index Terms

Crowdsourced Monolingual Translation
1. Human-centered computing

Published In

ACM Transactions on Computer-Human Interaction Volume 21, Issue 4

August 2014

141 pages

ISSN: 1073-0516

EISSN: 1557-7325

DOI: 10.1145/2633907

Editor: Shumin Zhai (Google)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 2014

Accepted: 01 March 2014

Revised: 01 January 2014

Received: 01 April 2013

Published in TOCHI Volume 21, Issue 4

Qualifiers

Research-article
Research
Refereed

Bibliometrics & Citations

Article Metrics

Total Citations: 4
Total Downloads: 500
Downloads (Last 12 months): 15
Downloads (Last 6 weeks): 1

Reflects downloads up to 19 Dec 2024

Cited By

Jiménez-Crespo M. (2021). The impact of crowdsourcing and online collaboration in professional translation. Babel. Revue internationale de la traduction / International Journal of Translation, 67:4 (395-417). Online publication date: 22-Sep-2021. https://doi.org/10.1075/babel.00230.jim

Feng W., Yan Z., Zhang H., Zeng K., Xiao Y., Hou Y. (2018). A Survey on Security, Privacy, and Trust in Mobile Crowdsourcing. IEEE Internet of Things Journal, 5:4 (2971-2992). Online publication date: Aug-2018. https://doi.org/10.1109/JIOT.2017.2765699

Lazar J., Feng J., Hochheiser H. (2017). Online and ubiquitous HCI research. Research Methods in Human Computer Interaction (411-453). Online publication date: 2017. https://doi.org/10.1016/B978-0-12-805390-4.00014-5

Friedman B., Nathan L., Yoo D. (2016). Multi-Lifespan Information System Design in Support of Transitional Justice: Evolving Situated Design Principles for the Long(er) Term. Interacting with Computers, 29:1 (80-96). Online publication date: 31-Jan-2016. https://doi.org/10.1093/iwc/iwv045
