research-article

PMAL: A Proxy Model Active Learning Approach for Vision Based Industrial Applications

Published: 06 October 2022

Abstract

The performance of deep learning models correlates strongly with the availability of annotated data; however, large-scale data labeling is laborious, expensive, and error-prone when performed by human experts. Active Learning (AL) addresses this challenge by selecting uncertain samples from an unlabeled data collection, but existing AL approaches require repeated human feedback to label those samples, which makes them infeasible to deploy in real-world industrial applications. The proposed Proxy Model based Active Learning technique (PMAL) addresses this issue by replacing the human oracle with a deep learning model, so that human expertise is needed only to label two small subsets of data: one for training the proxy model and one for initializing the AL loop. In PMAL, the proxy model is first trained on a small subset of labeled data and subsequently acts as an oracle for annotating uncertain samples. Second, training of the active model, extraction of uncertain samples via uncertainty sampling, and annotation by the proxy model are repeated for a predefined number of iterations to increase both the accuracy and the amount of labeled data. Finally, the active model is evaluated on test data to verify the effectiveness of the technique for practical applications. Explainable artificial intelligence is employed to ensure that the proxy model's annotations are correct. Similarly, an emerging vision transformer is used as the active model to achieve maximum accuracy. Experimental results reveal that the proposed method outperforms the state of the art in terms of minimum labeled data usage and improves accuracy by 2.2%, 2.6%, and 1.35% on the Caltech-101, Caltech-256, and CIFAR-10 datasets, respectively. Since the proposed technique offers a highly reasonable solution for exploiting huge multimedia data, it can be widely used in different evolving industrial domains.
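The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier stands in for both the proxy model and the vision transformer, entropy is assumed as the uncertainty measure (the abstract only says "uncertainty sampling"), the explainable-AI check on proxy annotations is omitted, and every name here (`fit_centroids`, `pmal_loop`, `make_blobs`, the batch size, the iteration count) is hypothetical.

```python
import math
import random

def fit_centroids(X, y):
    """Toy stand-in for model training: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}

def predict_proba(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    classes = sorted(centroids)
    scores = [-sum((a - b) ** 2 for a, b in zip(x, centroids[c])) for c in classes]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return classes, [e / total for e in exps]

def predict(centroids, x):
    classes, probs = predict_proba(centroids, x)
    return classes[probs.index(max(probs))]

def entropy(probs):
    """Predictive entropy; higher means the model is less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pmal_loop(proxy_X, proxy_y, init_X, init_y, pool, iterations=3, batch=10):
    # Step 1: train the proxy model on its small human-labeled subset.
    proxy = fit_centroids(proxy_X, proxy_y)
    # Step 2: initialize the active model on the second human-labeled subset.
    labeled_X, labeled_y = list(init_X), list(init_y)
    pool = list(pool)
    active = fit_centroids(labeled_X, labeled_y)
    for _ in range(iterations):
        if not pool:
            break
        # Uncertainty sampling (assumed: entropy), most uncertain first.
        pool.sort(key=lambda x: entropy(predict_proba(active, x)[1]), reverse=True)
        queried, pool = pool[:batch], pool[batch:]
        # The proxy model, not a human, annotates the queried samples.
        for x in queried:
            labeled_X.append(x)
            labeled_y.append(predict(proxy, x))
        active = fit_centroids(labeled_X, labeled_y)
    return active, labeled_X, labeled_y

def make_blobs(n, rng):
    """Synthetic two-class 2-D data, used only for this illustration."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 4.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y

rng = random.Random(0)
proxy_X, proxy_y = make_blobs(20, rng)   # human-labeled subset for the proxy
init_X, init_y = make_blobs(6, rng)      # human-labeled subset to start AL
pool, _ = make_blobs(60, rng)            # unlabeled pool (labels discarded)
test_X, test_y = make_blobs(100, rng)    # held-out test data

active, labeled_X, labeled_y = pmal_loop(proxy_X, proxy_y, init_X, init_y, pool)
correct = sum(predict(active, x) == t for x, t in zip(test_X, test_y))
accuracy = correct / len(test_X)
print(f"labeled samples: {len(labeled_X)}, test accuracy: {accuracy:.2f}")
```

In this sketch the human-provided labels are limited to `proxy_y` and `init_y`; every label added inside the loop comes from the proxy, which is the property the abstract emphasizes. A real deployment would replace the toy classifier with the actual proxy and active models and would need some check on the quality of the proxy's annotations.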

    Information & Contributors

    Information

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 2s
    June 2022
    383 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/3561949
    Editor: Abdulmotaleb El Saddik

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 October 2022
    Online AM: 21 June 2022
    Accepted: 29 April 2022
    Revised: 12 April 2022
    Received: 31 December 2021
    Published in TOMM Volume 18, Issue 2s

    Author Tags

    1. Active learning
    2. convolution neural networks
    3. data analytics
    4. image classification
    5. proxy model
    6. patterns matching
    7. uncertainty sampling
    8. vision transformer

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • Institute of Information & Communications Technology Planning & Evaluation (IITP)
    • Korea government (MSIT)


    Cited By

    • (2024) Recent Applications of Explainable AI (XAI): A Systematic Literature Review. Applied Sciences 14:19 (8884). DOI: 10.3390/app14198884. Online publication date: 2-Oct-2024
    • (2023) Tracking and handling behavioral biases in active learning frameworks. Information Sciences: an International Journal 641:C. DOI: 10.1016/j.ins.2023.119117. Online publication date: 1-Sep-2023
    • (2023) Deep learning based active learning technique for data annotation and improve the overall performance of classification models. Expert Systems with Applications: An International Journal 228:C. DOI: 10.1016/j.eswa.2023.120391. Online publication date: 15-Oct-2023
    • (2022) Vision-Based Semantic Segmentation in Scene Understanding for Autonomous Driving: Recent Achievements, Challenges, and Outlooks. IEEE Transactions on Intelligent Transportation Systems 23:12 (22694-22715). DOI: 10.1109/TITS.2022.3207665. Online publication date: Dec-2022
