DOI: 10.1145/3661638.3661707
research-article

Harmful Information Detection of Web pages with Attentional Deep Neural Networks and Multi-level Feature Fusion

Published: 01 June 2024

Abstract

With the rapid development of the Internet in our country, society has gradually become informatized, and a huge population of netizens has formed a digital society. In this digital society, websites are an important means for people to obtain information services. Our country has a large number of websites, each providing a large number of web pages, but many of these websites are not effectively maintained. Every year, many websites are hacked and their web pages tampered with, and the tampered pages typically spread harmful information. In addition, many overseas websites outside government supervision specialize in spreading harmful information. This flood of harmful information continues to erode and endanger normal social life, so the relevant government departments need to supervise and manage the problem. The means of supervision is to identify and detect harmful information content on the web pages published by each website. This paper focuses on classifying web pages by their harmful information content. It mines stop words from web page text, reduces noise in the data by removing the stop words unique to web page text, and proposes a classification model based on deep attention learning and multi-level feature fusion that achieves better classification and prediction performance on web page text. Extensive comparative experiments on real data sets demonstrate the effectiveness of the method.
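The abstract does not say how the web-page-specific stop words are mined. As a rough illustration only, the sketch below assumes a simple document-frequency heuristic: tokens that occur in nearly every page (navigation labels, for example) are treated as stop words and removed. The function names, the `min_doc_ratio` threshold, and the sample pages are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def mine_stop_words(docs, min_doc_ratio=0.8):
    """Return tokens that occur in at least min_doc_ratio of the documents.

    Assumption: high document frequency marks a token as page boilerplate.
    This is an illustrative heuristic, not the paper's mining procedure.
    """
    doc_freq = Counter()
    for doc in docs:
        # Count each token once per document (document frequency).
        doc_freq.update(set(doc.lower().split()))
    threshold = min_doc_ratio * len(docs)
    return {tok for tok, df in doc_freq.items() if df >= threshold}


def remove_stop_words(doc, stop_words):
    """Drop the mined stop words from one whitespace-tokenized document."""
    return " ".join(t for t in doc.lower().split() if t not in stop_words)


# Hypothetical page texts that share the same navigation words.
pages = [
    "home login contact cheap pills buy now",
    "home login contact weather forecast for monday",
    "home login contact city council meeting notes",
]
stop_words = mine_stop_words(pages)
cleaned = [remove_stop_words(p, stop_words) for p in pages]
```

With these sample pages, the shared navigation tokens are mined as stop words and only the page-specific text is left for a downstream classifier.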

    Published In

    AISNS '23: Proceedings of the 2023 International Conference on Artificial Intelligence, Systems and Network Security
    December 2023
    467 pages
    ISBN:9798400716966
    DOI:10.1145/3661638
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AISNS 2023
