review-article

Crowdsourced Data Management: Industry and Academic Perspectives

Published: 01 December 2015

Abstract

Crowdsourcing and human computation enable organizations to accomplish tasks that fully automated techniques cannot currently complete, or that require more flexibility and scalability than traditional employment relationships can provide. In the area of data processing, companies have benefited from crowd workers on platforms such as Amazon’s Mechanical Turk or Upwork to complete tasks as varied as content moderation, web content extraction, entity resolution, and video/audio/image processing. Academic researchers from diverse areas ranging from the social sciences to computer science have embraced crowdsourcing as a research area, resulting in algorithms and systems that improve crowd work quality, latency, or cost. Given the relative nascence of the field, the academic and practitioner communities have largely operated independently of each other for the past decade, rarely exchanging techniques and experiences. In this monograph, we aim to narrow the gap between academics and practitioners. On the academic side, we summarize the state of the art in crowd-powered algorithms and system design tailored to large-scale data processing. On the industry side, we survey 13 industry users (e.g., Google, Facebook, Microsoft) and 4 marketplace providers of crowd work (e.g., CrowdFlower, Upwork) to identify how hundreds of engineers and tens of millions of dollars are invested in various crowdsourcing solutions. Through the monograph, we hope to simultaneously introduce academics to real problems that practitioners encounter every day, and provide a survey of the state of the art for practitioners to incorporate into their designs. Through our surveys, we also highlight the fact that crowd-powered data processing is a large and growing field. Over the next decade, we believe that most technical organizations will in some way benefit from crowd work, and hope that this monograph can help guide the effective adoption of crowdsourcing across these organizations.

      Information & Contributors

      Information

      Published In

      Foundations and Trends in Databases, Volume 6, Issue 1-2
      Dec 2015
      165 pages
      ISSN: 1931-7883
      EISSN: 1931-7891

      Publisher

      Now Publishers Inc.

      Hanover, MA, United States

      Publication History

      Published: 01 December 2015

      Author Tag

      1. Crowdsourcing

      Qualifiers

      • Review-article

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 0
      • Downloads (Last 6 weeks): 0
      Reflects downloads up to 01 Jan 2025

      Citations

      Cited By

      • (2024) Quantifying and Leveraging Uncertain and Imprecise Answers in Multiple Choice Questionnaires for Crowdsourcing. Proceedings of the 35th Conference on l'Interaction Humain-Machine. DOI: 10.1145/3649792.3649806, pp. 1-10. Online publication date: 25-Mar-2024
      • (2022) Information Resilience: the nexus of responsible and agile approaches to information use. The VLDB Journal — The International Journal on Very Large Data Bases 31:5. DOI: 10.1007/s00778-021-00720-2, pp. 1059-1084. Online publication date: 16-Jan-2022
      • (2021) Scalable Database Normalization Powered by the Crowd. Proceedings of the 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD). DOI: 10.1145/3430984.3431032, pp. 213-217. Online publication date: 2-Jan-2021
      • (2021) Investigating the Accessibility of Crowdwork Tasks on Mechanical Turk. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. DOI: 10.1145/3411764.3445291, pp. 1-14. Online publication date: 6-May-2021
      • (2021) Efficient and flexible crowdsourcing of specialized tasks with precedence constraints. IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications. DOI: 10.1109/INFOCOM.2016.7524615, pp. 1-9. Online publication date: 10-Mar-2021
      • (2020) Prediction of Hourly Earnings and Completion Time on a Crowdsourcing Platform. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. DOI: 10.1145/3394486.3403369, pp. 3172-3182. Online publication date: 23-Aug-2020
      • (2020) An Empirical Survey on Crowdsourcing-Based Data Management Techniques. Proceedings of the International Conference on Computing Advancements. DOI: 10.1145/3377049.3377106, pp. 1-7. Online publication date: 10-Jan-2020
      • (2020) How Impactful Is Presentation in Email? The Effect of Avatars and Signatures. ACM Transactions on Interactive Intelligent Systems 10:3. DOI: 10.1145/3345641, pp. 1-26. Online publication date: 13-Nov-2020
      • (2019) Hybrid Crowd-Machine Wrapper Inference. ACM Transactions on Knowledge Discovery from Data 13:5. DOI: 10.1145/3344720, pp. 1-43. Online publication date: 24-Sep-2019
      • (2019) Deadline-Aware Fair Scheduling for Multi-Tenant Crowd-Powered Systems. ACM Transactions on Social Computing 2:1. DOI: 10.1145/3301003, pp. 1-29. Online publication date: 21-Feb-2019
