DOI: 10.1007/978-3-030-85896-4_14
Article

Data Poisoning Attacks on Crowdsourcing Learning

Published: 23 August 2021

Abstract

Understanding and assessing the vulnerability of crowdsourcing learning to data poisoning attacks is key to ensuring the quality of classifiers trained from crowdsourced labeled data. Existing studies of data poisoning attacks focus only on the vulnerability of crowdsourced label collection. Yet in crowdsourcing learning the main concern is the performance of the trained classifier rather than the quality of the labels themselves, and the impact of data poisoning attacks on the final classifier remains underexplored to date. We aim to bridge this gap. First, we formalize the poisoning-attack problem, where the objective is to maximally sabotage the trained classifier. Second, we transform the problem into a bilevel min-max optimization problem for the typical learning-from-crowds model and design an efficient adversarial strategy. Extensive validation on real-world datasets demonstrates that our attack can significantly decrease the test accuracy of trained classifiers. We also verify that labels generated with our strategy transfer to attacks on a broad family of crowdsourcing learning models in a black-box setting, indicating the strategy's applicability and its potential to extend to the physical world.
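The bilevel min-max structure mentioned in the abstract can be sketched in generic notation. This is an illustrative reconstruction under common conventions for poisoning attacks, not the paper's exact formulation: here \(Y\) denotes the clean crowd labels, \(\tilde{Y}\) the poisoned labels the attacker controls, \(\theta\) the classifier parameters, \(B\) a label-modification budget, and \(D_{\mathrm{val}}\) clean evaluation data — all symbols are our own.

```latex
% Schematic bilevel poisoning objective (illustrative notation only):
% the attacker perturbs the crowd labels within a budget so that the
% classifier trained on them performs as poorly as possible.
\begin{align*}
\max_{\tilde{Y}}\;
  & \mathcal{L}\bigl(\theta^{*}(\tilde{Y});\, D_{\mathrm{val}}\bigr)
  && \text{(outer: maximize loss of the trained classifier)} \\
\text{s.t.}\;
  & \theta^{*}(\tilde{Y}) \in \arg\min_{\theta}\,
    \mathcal{L}_{\mathrm{crowd}}\bigl(\theta;\, X,\, \tilde{Y}\bigr)
  && \text{(inner: learning-from-crowds training)} \\
  & d\bigl(\tilde{Y},\, Y\bigr) \le B
  && \text{(budget on how many labels may change)}
\end{align*}
```

The inner problem is the usual learning-from-crowds training objective (e.g. a Raykar-style model that jointly estimates worker reliabilities and classifier parameters), while the outer problem searches over label perturbations; the paper's efficient adversarial strategy approximately solves this nested optimization.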



Published In

Web and Big Data: 5th International Joint Conference, APWeb-WAIM 2021, Guangzhou, China, August 23–25, 2021, Proceedings, Part I
Aug 2021
514 pages
ISBN: 978-3-030-85895-7
DOI: 10.1007/978-3-030-85896-4

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Crowdsourcing
  2. Adversarial machine learning
  3. Data poisoning attack
