Securing Machine Learning Against Data Poisoning Attacks

Published: 13 December 2024

Abstract

The emergence of intelligent networks has transformed the use of machine learning (ML), allowing it to be applied across many domains of human life. This literature review provides an in-depth analysis of existing research on data poisoning attacks and examines how intelligent networks can mitigate these threats. Specifically, the author explores how malicious users inject fake training data into adversarial networks, a technique known as a data poisoning attack, which can severely compromise a model's integrity. Through a comparative evaluation of attack strategies and defense mechanisms, such as robust optimization and model-based detection, the author assesses the strengths and limitations of current defenses. Real-world applications are discussed, including the use of these networks in cybersecurity, healthcare, and smart city systems. The author concludes by outlining the challenges and future directions in developing more effective defense strategies that detect and mitigate data poisoning attacks in real time, with the aim of preserving the security and privacy of intelligent networks.
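
To make the attack described in the abstract concrete, the following is a minimal sketch and is not taken from the article. It assumes a toy two-class dataset of 2-D Gaussian blobs, a 1-nearest-neighbour classifier as the victim model, and a simple k-nearest-neighbour label filter as the defense. The filter is a basic data-sanitization heuristic chosen for brevity, not the robust optimization or model-based detection methods the review evaluates. All function names and parameters (make_data, inject_poison, knn_filter, run_demo) are hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def make_data(rng, n_per_class):
    """Toy data (assumption): class 0 around (0, 0), class 1 around (4, 4)."""
    data = []
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    return data


def inject_poison(rng, clean, n_poison):
    """Attacker appends fake points that lie in class 1's region but are labelled 0."""
    fake = [((rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)), 0) for _ in range(n_poison)]
    return clean + fake


def predict_1nn(train, point):
    """Victim model: return the label of the closest training point."""
    return min(train, key=lambda item: dist2(item[0], point))[1]


def accuracy(train, test):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(1 for point, label in test if predict_1nn(train, point) == label)
    return correct / len(test)


def knn_filter(train, k=5):
    """Defense sketch: keep a point only if a strict majority of its k nearest
    neighbours (excluding itself) share its label."""
    kept = []
    for i, (point, label) in enumerate(train):
        neighbours = sorted(
            (dist2(point, other), other_label)
            for j, (other, other_label) in enumerate(train)
            if j != i
        )[:k]
        votes = sum(1 for _, other_label in neighbours if other_label == label)
        if votes * 2 > k:
            kept.append((point, label))
    return kept


def run_demo(seed=0):
    """Return (clean, poisoned, defended) test accuracy for one random draw."""
    rng = random.Random(seed)
    train = make_data(rng, 100)
    test = make_data(rng, 200)
    poisoned = inject_poison(rng, train, 20)   # 20 fake points injected
    defended = knn_filter(poisoned, k=5)       # sanitize before training
    return accuracy(train, test), accuracy(poisoned, test), accuracy(defended, test)


if __name__ == "__main__":
    clean_acc, poisoned_acc, defended_acc = run_demo()
    print(f"clean={clean_acc:.3f} poisoned={poisoned_acc:.3f} defended={defended_acc:.3f}")
```

In this setup the injected points cause the classifier to mislabel part of class 1, and filtering the training set before fitting removes most of them. The filter works here because the fake points are a minority inside a dense region of honestly labelled data; it would not catch poison that forms its own isolated cluster, which is one reason the literature studies stronger defenses.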

Information

Published In

International Journal of Data Warehousing and Mining, Volume 20, Issue 1
Oct 2024
323 pages

Publisher

IGI Global

United States

Author Tags

  1. Adversarial Machine Learning
  2. Data Poisoning Attack
  3. Defense Strategies
  4. Emerging Security Challenges
  5. Security Threats

Qualifiers

  • Article
