Truth inference in crowdsourcing: Is the problem solved?

Y Zheng, G Li, Y Li, C Shan, R Cheng - Proceedings of the VLDB …, 2017 - dl.acm.org
Proceedings of the VLDB Endowment, 2017
Crowdsourcing has emerged as a novel problem-solving paradigm that helps address problems that are hard for computers, e.g., entity resolution and sentiment analysis. However, due to the openness of crowdsourcing, workers may give low-quality answers, so a redundancy-based method is widely employed: each task is first assigned to multiple workers, and the correct answer (called the truth) is then inferred from the answers of the assigned workers. A fundamental problem in this method is truth inference, which decides how to infer the truth effectively. Recently, the database and data mining communities have independently studied this problem and proposed various algorithms. However, these algorithms have not been compared extensively under the same framework, which makes it hard for practitioners to select an appropriate algorithm. To alleviate this problem, we provide a detailed survey of 17 existing algorithms and perform a comprehensive evaluation using 5 real datasets. We make all code and datasets public for future research. Through experiments we find that existing algorithms are not stable across different datasets and that no algorithm consistently outperforms the others. We believe that the truth inference problem is not fully solved; we identify the limitations of existing algorithms and point out promising research directions.
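
To make the redundancy-based method concrete, the following is a minimal sketch of majority voting, a common baseline for truth inference. It is an illustration only, not the paper's released code: the function name `majority_vote`, the `(task_id, worker_id, label)` input format, the tie-breaking rule, and the sample answers are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Infer one label per task from redundant worker answers.

    `answers` is an iterable of (task_id, worker_id, label) triples.
    This input format is an assumption made for this sketch.
    """
    # Count how many workers gave each label, per task.
    votes = defaultdict(Counter)
    for task_id, _worker_id, label in answers:
        votes[task_id][label] += 1

    truth = {}
    for task_id, counter in votes.items():
        best = max(counter.values())
        # Hypothetical tie-breaking rule: among the most frequent labels,
        # pick the smallest in sort order so the result is deterministic.
        truth[task_id] = min(lbl for lbl, cnt in counter.items() if cnt == best)
    return truth


# Made-up example: two sentiment tasks, each answered by three workers.
answers = [
    ("t1", "w1", "positive"),
    ("t1", "w2", "positive"),
    ("t1", "w3", "negative"),
    ("t2", "w1", "negative"),
    ("t2", "w2", "positive"),
    ("t2", "w3", "negative"),
]
inferred = majority_vote(answers)
print(inferred)
```

Majority voting weights every worker equally. A method that also estimates how reliable each worker is would need more state than the per-task counters used here.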