DOI: 10.5555/2936924.2937066
Research article, Public Access

Optimal Testing for Crowd Workers

Published: 09 May 2016

Abstract

Requesters on crowdsourcing platforms, such as Amazon Mechanical Turk, routinely insert gold questions to verify that a worker is diligent and is providing high-quality answers. However, there is no clear understanding of when to insert gold questions or how many to use. Typically, requesters mix a flat 10-30% of gold questions into the task stream of every worker. This static policy is arbitrary and wastes valuable budget: the exact percentage is often chosen with little experimentation, and, more importantly, it does not adapt to individual workers, the current mixture of spamming vs. diligent workers, or the number of tasks workers perform before quitting.
We formulate the problem of balancing between (1) testing workers to determine their accuracy and (2) actually getting work performed as a partially observable Markov decision process (POMDP) and apply reinforcement learning to dynamically calculate the best policy. Evaluations on both synthetic data and with real Mechanical Turk workers show that our agent learns adaptive testing policies that produce up to 111% more reward than the non-adaptive policies used by most requesters. Furthermore, our method is fully automated, easy to apply, and runs mostly out of the box.
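The trade-off in the abstract can be made concrete with a small simulation. The Python sketch below is illustrative only, not the paper's implementation: the two hidden worker types, their accuracies, the cost and value constants, and the hand-picked belief thresholds are all assumptions made for the example, whereas the paper learns the policy itself via reinforcement learning over a POMDP. What the sketch does show is the core mechanic: maintain a Bayesian belief that the worker is diligent, spend budget on gold questions while that belief is uncertain, and switch to real work (or drop the worker) once the belief is confident.

    # Illustrative sketch only: all numbers below are assumptions, not values
    # from the paper. A fixed threshold policy over a Bayesian belief stands in
    # for the paper's learned POMDP policy.
    import random

    P_CORRECT = {"diligent": 0.9, "spammer": 0.5}  # assumed accuracy per hidden type
    GOLD_COST, TASK_VALUE = 0.1, 1.0               # assumed test cost / work value

    def update_belief(b_diligent, answered_correctly):
        """Bayes-update P(worker is diligent) after observing a gold answer."""
        p_obs = lambda t: P_CORRECT[t] if answered_correctly else 1.0 - P_CORRECT[t]
        num = b_diligent * p_obs("diligent")
        return num / (num + (1.0 - b_diligent) * p_obs("spammer"))

    def run_worker(true_type, n_tasks=30, b=0.5, test_band=(0.2, 0.9)):
        """Test while the belief b is inside test_band, drop the worker below
        the band, assign real work above it. Returns total reward."""
        low, high = test_band
        reward = 0.0
        for _ in range(n_tasks):
            if b < low:                          # confident spammer: stop using them
                break
            if b < high:                         # uncertain: insert a gold question
                correct = random.random() < P_CORRECT[true_type]
                b = update_belief(b, correct)
                reward -= GOLD_COST
            else:                                # confident diligent: real task
                reward += TASK_VALUE * P_CORRECT[true_type]
        return reward

    random.seed(0)
    print("diligent worker reward:", round(run_worker("diligent"), 2))
    print("spammer reward:        ", round(run_worker("spammer"), 2))

Under these assumed numbers, a diligent worker clears the belief band after a few gold questions and then contributes real work, while a spammer's belief decays and the agent stops paying them; a flat 10-30% gold mix would keep testing the first worker and keep paying the second. The paper's contribution is learning when to make each switch automatically rather than relying on hand-set thresholds like test_band here.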


Information

Published In

AAMAS '16: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems
May 2016
1580 pages
ISBN:9781450342391

Sponsors

  • IFAAMAS

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Publication History

Published: 09 May 2016

Author Tags

  1. crowdsourcing
  2. reinforcement learning

Qualifiers

  • Research-article

Funding Sources

  • Bloomberg
  • Google
  • NSF
  • ONR

Conference

AAMAS '16

Acceptance Rates

AAMAS '16 Paper Acceptance Rate: 137 of 550 submissions (25%)
Overall Acceptance Rate: 1,155 of 5,036 submissions (23%)

Article Metrics

  • Downloads (last 12 months): 56
  • Downloads (last 6 weeks): 4

Reflects downloads up to 03 Jan 2025

Cited By

  • Hierarchical Entity Resolution using an Oracle. Proceedings of the 2022 International Conference on Management of Data, pp. 414-428, 10 Jun 2022. DOI: 10.1145/3514221.3526147
  • The Design and Development of a Game to Study Backdoor Poisoning Attacks: The Backdoor Game. Proceedings of the 26th International Conference on Intelligent User Interfaces, pp. 423-433, 14 Apr 2021. DOI: 10.1145/3397481.3450647
  • Crowdsourcing with Fairness, Diversity and Budget Constraints. Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp. 297-304, 27 Jan 2019. DOI: 10.1145/3306618.3314282
  • Key Crowdsourcing Technologies for Product Design and Development. International Journal of Automation and Computing 16(1), pp. 1-15, 1 Feb 2019. DOI: 10.1007/s11633-018-1138-7
  • CrowdEval. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pp. 1486-1494, 9 Jul 2018. DOI: 10.5555/3237383.3237922
  • Sprout. Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology, pp. 165-176, 11 Oct 2018. DOI: 10.1145/3242587.3242598
  • Robust Entity Resolution using Random Graphs. Proceedings of the 2018 International Conference on Management of Data, pp. 3-18, 27 May 2018. DOI: 10.1145/3183713.3183755
  • Crowd-based Multi-Predicate Screening of Papers in Literature Reviews. Proceedings of the 2018 World Wide Web Conference, pp. 55-64, 10 Apr 2018. DOI: 10.1145/3178876.3186036
