More Web Proxy on the site http://driver.im/

article

Securing Machine Learning Against Data Poisoning Attacks

Author:

Nasser AllheeibAuthors Info & Claims

International Journal of Data Warehousing and Mining (IJDWM), Volume 20, Issue 1

Pages 1 - 21

https://doi.org/10.4018/IJDWM.358335

Published: 16 November 2024 Publication History

Abstract

The emergence of intelligent networks has revolutionized the use of machine learning (ML), allowing it to be applied in various domains of human life. This literature review paper provides in-depth analysis of the existing research on data poisoning attacks and examines how intelligent networks can mitigate these threats. Specifically, the author explores how malicious users inject fake training data into adversarial networks, a technique known as a data poisoning attack, which can severely compromise the model's integrity. Through a comparative evaluation of the attack strategies and defense mechanisms, such as robust optimization and model-based detection, the author assesses the strengths and limitations of current defenses. Real-world applications are discussed, including the use of these networks in cybersecurity, healthcare, and smart city systems. The author concludes by outlining the challenges and future directions in developing more effective defense strategies to detect and mitigate data poisoning attacks in real time, ensuring the security and privacy of intelligent networks.

References

[1]

Adi, Y., Baum, C., Cisse, M., Pinkas, B., & Keshet, J. (2018). Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In SEC ’18: Proceedings of the 27th USENIX Security Symposium (pp. 1615–1631). USENIX Association., https://www.usenix.org/conference/usenixsecurity18/presentation/adi.

[2]

Aghakhani, H., Eisenhofer, T., Schönherr, L., Kolossa, D., Holz, T., Krüegel, C., & Vigna, G. (2020). VenoMave: Targeted poisoning against speech recognition. arXiv:2010.10682 [cs.SD]. https://doi.org//arXiv.2010.1068210.48550

[3]

Aghakhani, H., Meng, D., Wang, Y.-X., Krüegel, C., & Vigna, G. (2020). Bullseye polytope: A scalable clean-label poisoning attack with improved transferability. arXiv (preprint) arXiv:2005.00191 [cs.LG]. https://doi.org//arXiv.2005.0019110.48550

[4]

Ahmed, I. M., & Kashmoola, M. Y. (2021). Threats on machine learning technique by data poisoning attack: A survey. In N. Abdullah, S. Manickam, & M. Anbar (Eds.), Advances in Cyber Security. ACeS 2021 (pp. 586–600). Communications in Computer and Information Science, vol 1487. Springer, Singapore.

[5]

Al Ghamdi, M. A., Bhatti, M. S., Saeed, A., Gillani, Z., & Almotiri, S. H. (2024). A fusion of BERT, machine learning, and manual approach for fake news detection. Multimedia Tools and Applications, 83(10), 30095–30112.

[6]

Almotiri, S. H., Nadeem, M., Al Ghamdi, M. A., & Khan, R. A. (2023). Analytic review of healthcare software by using quantum computing security techniques. International Journal of Fuzzy Logic and Intelligent Systems, 23(3), 336–352.

[7]

Alsuwat, E., Alsuwat, H., Valtorta, M., & Farkas, C. (2019). Adversarial data poisoning attacks against the PC learning algorithm. International Journal of General Systems, 49(1), 3–31.

[8]

Alzahrani, S., Alsuwat, H., & Alsuwat, E. (2024). Evaluating the efficacy of latent variables in mitigating data poisoning attacks in the context of Bayesian networks: An empirical study. Computer Modeling in Engineering & Sciences, 139(2), 1635–1654.

[9]

Andreina, S., Marson, G. A., Möllering, H., & Karame, G. (2020). BaFFle: Backdoor detection via feedback-based federated learning. arXiv (preprint). arXiv:2011.02167. https://doi.org//arXiv.2011.0216710.48550

[10]

Ashcraft, C., & Karra, K. (2021). Poisoning deep reinforcement learning agents with in-distribution triggers. arXiv (preprint). arXiv:2106.07798. https://doi.org//arXiv.2106.0779810.48550

[11]

Chen, J., Zhang, X., Zhang, R., Wang, C., & Liu, L. (2021). De-pois: An attack-agnostic defense against data poisoning attacks. IEEE Transactions on Information Forensics and Security, 16, 3412–3425.

[12]

Comiter, M. (2019). Attacking artificial intelligence: AI’s security vulnerability and what policymakers can do about it. Belfer Center for Science and International Affairs, Harvard Kennedy School.

[13]

Deng, Y., Zhang, T., Lou, G., Zheng, X., Jin, J., & Han, Q.-L. (2021). Deep learning-based autonomous driving systems: A survey of attacks and defenses. IEEE Transactions on Industrial Informatics, 17(12), 7897–7912.

[14]

Dineen, J., Ahsan-Ul Haque, A. S. M., & Bielskas, M. (2021). Reinforcement learning for data poisoning on graph neural networks. In Thomson, R., Hussain, M. N., Dancy, C., & Pyke, A. (Eds.), Lecture Notes in Computer Science: Vol. 12720. Social, cultural, and behavioral modeling. SBP-BRiMS 2021 (pp. 141–150). Springer.

Digital Library

[15]

Dixit, P., & Silakari, S. (2021). Deep learning algorithms for cybersecurity applications: A technological and status review. Computer Science Review, 39, 100317.

[16]

DokuR.RawatD. B. (2021). Mitigating data poisoning attacks on a federated learning-edge computing network. In Proceedings of the 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), pp. 1–6. IEEE. 10.1109/CCNC49032.2021.9369581

[17]

EykholtK.EvtimovI.FernandesE.LiB.RahmatiA.XiaoC.PrakashA.KohnoT.SongD. (2018). Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1625–1634. IEEE. 10.1109/CVPR.2018.00175

[18]

Goldblum, M., Tsipras, D., Xie, C., Chen, X., Schwarzschild, A., Song, D., Madry, A., Li, B., & Goldstein, T. (2023). Dataset security for machine learning: Data poisoning, backdoor attacks, and defenses. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(02), 1563–1580. 35333711.

[19]

Gu, T., Dolan-Gavitt, B., & Garg, S. (2017). BadNets: Identifying vulnerabilities in the machine learning model supply chain. arXiv (preprint). arXiv:1708.06733 [cs.CR]. https://doi.org//arXiv.1708.0673310.48550

[20]

HitajB.AtenieseG.Perez-CruzF. (2017). Deep models under the GAN: Information leakage from collaborative deep learning. In CCS ’17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 603–618. Association for Computing Machinery. 10.1145/3133956.3134012

Digital Library

[21]

Horkoff, J. (2019). Non-functional requirements for machine learning: Challenges and new directions. In 2019 IEEE 27th International Requirements Engineering Conference (RE), pp. 386–391. IEEE. 10.1109/RE.2019.00050

[22]

Huang, H., Mu, J., Gong, N. Z., Li, Q., Liu, B., & Xu, M. (2021). Data poisoning attacks to deep learning based recommender systems. arXiv (preprint). arXiv:2101.02644. 10.14722/ndss.2021.24525

[23]

LiD.WongW. E.WangW.YaoY.ChauM. (2021). Detection and mitigation of label-flipping attacks in federated learning systems with KPCA and K-means. In Proceedings of the 2021 8th International Conference on Dependable Systems and Their Applications (DSA), pp. 551–559. IEEE. 10.1109/DSA52907.2021.00081

[24]

Liu, X., Xie, L., Wang, Y., Zou, J., Xiong, J., Ying, Z., & Vasilakos, A. V. (2020). Privacy and security issues in deep learning: A survey. IEEE Access : Practical Innovations, Open Solutions, 9, 4566–4593.

[25]

Liu, Y., Ma, S., Aafer, Y., Lee, W.-C., Zhai, J., Wang, W., & Zhang, X. (2018). Trojaning attack on neural networks. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium, pp. 1–15. 10.14722/ndss.2018.23291

[26]

Long, Y., Bindschaedler, V., Wang, L., Bu, D., Wang, X., Tang, H., Gunter, C. A., & Chen, K. (2018). Understanding membership inferences on well-generalized learning models. arXiv:1802.04889 [cs.CR]. https://doi.org//arXiv.1802.0488910.48550

[27]

LowdD.MeekC. (2005). Good word attacks on statistical spam filters. In Proceedings of the Second Conference on Email and Anti-Spam (CEAS), pp. 1–8.

[28]

Ma, X., Li, B., Wang, Y., Erfani, S. M., Wijewickrema, S., Schoenebeck, G., Song, D., Houle, M. E., & Bailey, J. (2018). Characterizing adversarial subspaces using local intrinsic dimensionality. arXiv (preprint). arXiv:1801.02613 [cs.LG]. https://doi.org//arXiv.1801.0261310.48550

[29]

Mehra, A., Kailkhura, B., Chen, P.-Y., & Hamm, J. (2021). Understanding the limits of unsupervised domain adaptation via data poisoning. In NIPS ’21: Proceedings of the 35th International Conference on Neural Information Processing Systems, 1327, pp. 17347–17359. https://proceedings.neurips.cc/paper/2021/file/90cc440b1b8caa520c562ac4e4bbcb51-Paper.pdf

[30]

MelisL.SongC.De ChristofaroE.ShmatikovV. (2019). Exploiting unintended feature leakage in collaborative learning. In Proceedings of the 40th IEEE Symposium on Security & Privacy (S&P), pp. 691–706. arXiv:1805.04049 [cs.CR]. 10.1109/SP.2019.00029

[31]

Miller, D. J., Xiang, Z., & Kesidis, G. (2020). Adversarial learning targeting deep neural network classification: A comprehensive review of defenses against attacks. Proceedings of the IEEE, 108(3), 402–433.

[32]

NasrM.ShokriR.HoumansadrA. (2019). Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning. In Proceedings of the IEEE Symposium on Security and Privacy (SP), pp. 739–753. IEEE. 10.1109/SP.2019.00065

[33]

Nweke, L. O. (2017). Using the CIA and AAA models to explain cybersecurity activities. PM World Journal, 6(12), 1–3. https://pmworldlibrary.net/wp-content/uploads/2017/05/171126-Nweke-Using-CIA-and-AAA-Models-to-explain-Cybersecurity.pdf

[34]

Qiu, W. (2022). A survey on poisoning attacks against supervised machine learning. arXiv (preprint). arXiv:2202.02510 [cs.CR]. https://doi.org//arXiv.2202.0251010.48550

[35]

Ramirez, M. A., Kim, S.-K., Hamadi, H. A., Damiani, E., Byon, Y.-J., Kim, T.-Y., Cho, C.-S., & Yeun, C. Y. (2022). Poisoning attacks and defenses on artificial intelligence: A survey. arXiv (preprint). arXiv:2202.10276 [cs.CR]. https://doi.org//arXiv.2202.1027610.48550

[36]

Rawal, A., Rawat, D., & Sadler, B. M. (2021). Recent advances in adversarial machine learning: Status, challenges and perspectives. Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III, 117462Q. SPIE.

[37]

Razmi, F., & Xiong, L. (2021). Classification auto-encoder based detector against diverse data poisoning attacks. arXiv (preprint). arXiv:2108.04206 [cs.LG]. https://doi.org//arXiv.2108.0420610.48550

[38]

Shan, S., Bhagoji, A. N., Zheng, H., & Zhao, B. Y. (2021). Poison tracking: Traceback of data poisoning attacks in neural networks. In Proceedings of the 31st USENIX Security Symposium. USENIX. https://researchr.org/publication/ShanB0Z22

[39]

ShokriR.StronatiM.SongC.ShmatikovV. (2017). Membership inference attacks against machine learning models. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), pp. 3–18. IEEE. 10.1109/SP.2017.41

[40]

Sun, G., Cong, Y., Dong, J., Wang, Q., Lyu, L., & Liu, J. (2021). Data poisoning attacks on federated machine learning. IEEE Internet of Things Journal, 9(13), 11365–11375.

[41]

SunZ.DuX.SongF.NiM.LiL. (2022). CoProtector: Protect open-source code against unauthorized training usage with data poisoning. In WWW ’22: Proceedings of the ACM Web Conference 2022, pp. 652–660. 10.1145/3485447.3512225

[42]

Tramèr, F., Zhang, F., Juels, A., Reiter, M. K., & Ristenpart, T. (2016). Stealing machine learning models via prediction APIs. In Proceedings of the 25th USENIX Security Symposium (pp. 601–618). USENIX Association., https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_tramer.pdf.

Digital Library

[43]

Uprety, A., & Rawat, D. B. (2021). Mitigating poisoning attack in federated learning. In 2021 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 01–07. IEEE. 10.1109/SSCI50451.2021.9659839

[44]

Usynin, D., Ziller, A., Makowski, M., Braren, R., Rueckert, D., Glocker, B., Kaissis, G., & Passerat-Palmbach, J. (2021). Adversarial interference and its mitigations in privacy-preserving collaborative machine learning. Nature Machine Intelligence, 3(9), 749–758.

[45]

Verde, L., Marulli, F., & Marrone, S. (2021). Exploring the impact of data poisoning attacks on machine learning model reliability. Procedia Computer Science, 192, 2624–2632.

Digital Library

[46]

Vinyals, O., & et. al., (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782), 350–354. 31666705.

[47]

WangB.GongN. Z. (2018). Stealing hyperparameters in machine learning. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP), pp. 36–52. IEEE. 10.1109/SP.2018.00038

[48]

Wang, C., Chen, J., Yang, Y., Ma, X., & Liu, J. (2022). Poisoning attacks and countermeasures in intelligent networks: Status quo and prospects. Digital Communications and Networks, 8(2), 225–234.

[49]

Wang, Z., Kang, Q., Zhang, X., & Hu, Q. (2022). Defense strategies toward model poisoning attacks in federated learning: A survey. arXiv (preprint). arXiv:2202.06414 [cs.CR]. /arXiv.2202.0641410.1109/WCNC51071.2022.9771619

[50]

Wazid, M., Das, A. K., Chamola, V., & Park, Y. (2022). Uniting cyber security and machine learning: Advantages, challenges and future research. ICT Express, 8(3), 313–321.

[51]

Xiong, J., Bi, R., Zhao, M., Guo, J., & Yang, Q. (2020). Edge-assisted privacy-preserving raw data sharing framework for connected autonomous vehicles. IEEE Wireless Communications, 27(3), 24–30.

[52]

Zhang, J., Pan, L., Han, Q.-L., Chen, C., Wen, S., & Xiang, Y. (2021). Deep learning based attack detection for cyber-physical system cybersecurity: A survey. IEEE/CAA Journal of Automatica Sinica, 9(3), 377–391. 10.1109/JAS.2021.1004261

[53]

Zhu, C., Huang, W. R., Shafahi, A., Li, H., Taylor, G., Studer, C., & Goldstein, T. (2019). Transferable clean-label poisoning attacks on deep neural nets. arXiv (preprint). arXiv:1905.05897 [stat.ML]. https://doi.org//arXiv.1905.0589710.48550

Index Terms

Securing Machine Learning Against Data Poisoning Attacks

Index terms have been assigned to the content through auto-classification.

Recommendations

Subpopulation Data Poisoning Attacks
CCS '21: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security

Machine learning systems are deployed in critical settings, but they might fail in unexpected ways, impacting the accuracy of their predictions. Poisoning attacks against machine learning induce adversarial modification of data used by a machine ...
Data Poisoning Attacks on Crowdsourcing Learning
Web and Big Data
Abstract
Understanding and assessing the vulnerability of crowdsourcing learning against data poisoning attacks is the key to ensure the quality of classifiers trained from crowdsourced labeled data. Existing studies on data poisoning attacks only focus on ...
Defending Against Adversarial Denial-of-Service Data Poisoning Attacks
DYNAMICS '20: Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security

Data poisoning is one of the most relevant security threats against machine learning and data-driven technologies. Since many applications rely on untrusted training data, an attacker can easily craft malicious samples and inject them into the training ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image International Journal of Data Warehousing and Mining

International Journal of Data Warehousing and Mining Volume 20, Issue 1

Oct 2024

323 pages

EISSN:1548-3932

Issue’s Table of Contents

Copyright © 2024.

Publisher

IGI Global

United States

Publication History

Published: 16 November 2024

Author Tags

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

View Options

View options

Figures

Tables

Media

View Issue’s Table of Contents