Crowdsourced Monolingual Translation

Published: 01 August 2014

Abstract

An enormous potential exists for solving certain classes of computational problems through rich collaboration among crowds of humans supported by computers. Solutions to these problems used to involve human professionals, who are expensive to hire or difficult to find. Despite significant advances, fully automatic systems still have much room for improvement. Recent research has involved recruiting large crowds of skilled humans (“crowdsourcing”), but crowdsourcing solutions are still restricted by the availability of those skilled human participants. With translation, for example, professional translators incur a high cost and are not always available; machine translation systems have been greatly improved recently but still can only provide passable translation; and crowdsourced translation is limited by the availability of bilingual humans.
This article describes crowdsourced monolingual translation, that is, translation performed by monolingual people. It is a collaborative form of translation carried out by two crowds, one speaking the source language and the other the target language, with machine translation as the mediating device.
The article presents a general protocol for crowdsourced monolingual translation and analyzes three systems that implemented the protocol. These systems were studied in various settings and were found to improve quality significantly over both machine translation and monolingual editing of machine translation output (“postediting”).

    Published In

    ACM Transactions on Computer-Human Interaction, Volume 21, Issue 4
    August 2014
    141 pages
    ISSN:1073-0516
    EISSN:1557-7325
    DOI:10.1145/2633907
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 August 2014
    Accepted: 01 March 2014
    Revised: 01 January 2014
    Received: 01 April 2013
    Published in TOCHI Volume 21, Issue 4

    Author Tags

    1. Crowdsourcing
    2. human computation
    3. translation

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Article Metrics

    • Downloads (last 12 months): 15
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 19 Dec 2024

    Cited By

    • (2021) The impact of crowdsourcing and online collaboration in professional translation. Babel. Revue internationale de la traduction / International Journal of Translation 67:4 (395-417). DOI: 10.1075/babel.00230.jim. Online publication date: 22-Sep-2021
    • (2018) A Survey on Security, Privacy, and Trust in Mobile Crowdsourcing. IEEE Internet of Things Journal 5:4 (2971-2992). DOI: 10.1109/JIOT.2017.2765699. Online publication date: Aug-2018
    • (2017) Online and ubiquitous HCI research. Research Methods in Human Computer Interaction (411-453). DOI: 10.1016/B978-0-12-805390-4.00014-5. Online publication date: 2017
    • (2016) Multi-Lifespan Information System Design in Support of Transitional Justice: Evolving Situated Design Principles for the Long(er) Term. Interacting with Computers 29:1 (80-96). DOI: 10.1093/iwc/iwv045. Online publication date: 31-Jan-2016
