Research article, DOI: 10.1145/3656766.3656970

A Detection and Classification Method of Asphalt Pavement Crack based on Vision Transformer

Published: 01 June 2024

Abstract

Road surface distress detection is an important component of smart maintenance of transportation infrastructure, and cracking is a common and far-reaching category of distress. Detecting road cracks accurately and quickly is therefore of great significance for preventive road maintenance. In recent years, deep learning, represented by convolutional neural networks (CNNs), has been widely used in road damage detection. However, a CNN relies on a stack of convolutional layers to extract visual features from training images. Because its ability to characterize long-distance correlations is limited, it is challenging to apply a CNN to transverse and longitudinal cracks in real-world scenarios. As an emerging deep learning architecture, the Transformer has received considerable attention in natural language processing and computer vision. In this paper, we employ the Vision Transformer (ViT) for road damage classification. First, histogram equalization is adopted for image preprocessing; this step enhances image contrast and reduces the influence of illumination variation. Second, ViT, ResNet, DenseNet, and EfficientNet are implemented separately, and the training dataset is expanded with data augmentation. Third, classification quality is evaluated by means of accuracy, F1-score, and recall, using a group of datasets of varying size to examine the deep neural networks. The experimental results indicate that ViT outperforms the CNN models in terms of classification quality, and that long-range crack structures are reasonably identified by a fine-tuned ViT model.
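The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the implementation, so everything here is an assumption: the function names `equalize_histogram` and `classification_metrics` are hypothetical, the equalization is the standard global variant on an 8-bit grayscale image, and recall and F1-score are macro-averaged over classes (the paper may use a different averaging scheme).

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization (assumed variant).

    `image` is a list of rows of integer gray levels in [0, levels).
    """
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [list(row) for row in image]

    # Map each gray level so the output histogram is roughly uniform.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged recall and F1 (averaging is an assumption)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    recalls, f1_scores = [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1_scores) / len(labels),
    }


if __name__ == "__main__":
    # Toy low-contrast patch; values are illustrative, not from the paper.
    patch = [[50, 50], [60, 70]]
    print(equalize_histogram(patch))

    # Illustrative labels using the two crack types named in the abstract.
    truth = ["transverse", "transverse", "longitudinal", "longitudinal"]
    preds = ["transverse", "longitudinal", "longitudinal", "longitudinal"]
    print(classification_metrics(truth, preds))
```

In the toy patch, the narrow range of gray levels 50 to 70 is stretched across the full 0 to 255 range, which is the contrast enhancement the abstract refers to. Macro averaging weights each crack class equally regardless of how many images it contains.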


Published In

ICBAR '23: Proceedings of the 2023 3rd International Conference on Big Data, Artificial Intelligence and Risk Management
November 2023
1156 pages
ISBN: 9798400716478
DOI: 10.1145/3656766

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICBAR 2023
