research-article

Harmful Information Detection of Web pages with Attentional Deep Neural Networks and Multi-level Feature Fusion

Authors:

Quanjiang Shen,

Chang Liu

AISNS '23: Proceedings of the 2023 International Conference on Artificial Intelligence, Systems and Network Security

Pages 370 - 374

https://doi.org/10.1145/3661638.3661707

Published: 01 June 2024

Abstract

With the rapid development of the Internet in our country, society has gradually become informatized, and a huge population of netizens has formed a digital society. In this digital society, websites are an important means for people to obtain information services. Our country has a large number of websites, each of which serves a large number of web pages, but many of these websites are not effectively maintained. Every year, a large number of websites are hacked and their pages tampered with, and the tampered pages mostly spread harmful information. In addition, a large number of overseas websites that operate outside government supervision specialize in spreading harmful information. This flood of harmful information continues to erode and endanger normal social life, so relevant government departments need to supervise and manage the problem. The means of supervision is to identify and detect harmful content on the web pages that each website publishes. This paper focuses on classifying web pages by whether they contain harmful information. It mines stop words from web page text, reduces noise in the data by removing the stop words unique to web page text, and proposes a classification model based on deep attention learning and multi-level feature fusion that achieves better classification and prediction results on web page text. Extensive comparative experiments on real data sets demonstrate the effectiveness of the method.
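The abstract does not say how the web-page-specific stop words are mined. As a rough illustration only, the sketch below assumes a simple document-frequency heuristic: a token that appears in most pages (navigation labels, for example) is treated as a stop word and removed. The function names, the `min_doc_ratio` parameter, and the toy pages are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def mine_stop_words(documents, min_doc_ratio=0.8):
    """Return tokens that occur in at least min_doc_ratio of the documents.

    Assumption: document frequency is used as the stop-word signal;
    the paper may use a different criterion.
    """
    doc_freq = Counter()
    for tokens in documents:
        # Count each token once per document.
        doc_freq.update(set(tokens))
    threshold = min_doc_ratio * len(documents)
    return {tok for tok, df in doc_freq.items() if df >= threshold}


def remove_stop_words(tokens, stop_words):
    """Drop mined stop words while keeping the original token order."""
    return [tok for tok in tokens if tok not in stop_words]


# Toy tokenized pages (hypothetical data).
pages = [
    "home login contact buy cheap pills now".split(),
    "home login contact local weather report".split(),
    "home login contact football match results".split(),
    "home login about casino bonus win".split(),
]

stop_words = mine_stop_words(pages, min_doc_ratio=0.75)
cleaned = [remove_stop_words(page, stop_words) for page in pages]
```

On this toy input, the navigation-style tokens shared by at least three of the four pages are removed, and the remaining tokens would be passed on to a downstream classifier.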

Index Terms

Harmful Information Detection of Web pages with Attentional Deep Neural Networks and Multi-level Feature Fusion
1. Information systems
  1. Information systems applications
    1. Data mining

Information & Contributors

Information

Published In

AISNS '23: Proceedings of the 2023 International Conference on Artificial Intelligence, Systems and Network Security

December 2023

467 pages

ISBN:9798400716966

DOI:10.1145/3661638

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 2024

Qualifiers

Research-article
Research
Refereed limited

Conference

AISNS 2023

AISNS 2023: 2023 International Conference on Artificial Intelligence, Systems and Network Security

December 22 - 24, 2023

Mianyang, China
