research-article

Deep Marginal Fisher Analysis based CNN for Image Representation and Classification

Published: 17 October 2021

Abstract

Deep Convolutional Neural Networks (CNNs) have achieved great success in image classification. While conventional CNNs, optimized with iterative gradient descent on large datasets, have been widely used and investigated, there is also research on learning CNNs with non-iterative optimization methods, such as the principal component analysis network (PCANet). PCANet is simple and efficient, yet achieves competitive performance on some image classification tasks, especially those with only a small amount of data available. This paper extends this line of research and proposes a deep Marginal Fisher Analysis (MFA) based CNN, termed DMNet. It addresses the limitation of PCANet-like CNNs when the samples do not follow a Gaussian distribution by using a local MFA for CNN filter optimization. Specifically, it uses a graph embedding framework to optimize the convolution filters, maximizing the inter-class discriminability among marginal points while minimizing the intra-class distance. Cascaded MFA convolution layers can be used to construct a deep network. Moreover, a binary stochastic hashing scheme is developed, which randomly selects features for binary hashing with a probability based on the importance of the feature maps. Experimental results demonstrate that the proposed method achieves state-of-the-art results among non-iteratively optimized CNN methods, and ablation studies verify the effectiveness of the proposed modules in DMNet.
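To make the MFA objective in the abstract concrete, the following is a minimal NumPy sketch of a generic Marginal Fisher Analysis projection in the graph embedding sense. It is an illustration under stated assumptions, not the DMNet implementation: the function name `mfa_projection`, the parameters `k_intra`, `k_inter`, and `reg`, and the per-sample neighbour rule for the penalty graph are all hypothetical choices, and the abstract does not say how patches are sampled or how the graphs are built in the paper.

```python
import numpy as np

def mfa_projection(X, y, k_intra=2, k_inter=2, n_components=1, reg=1e-3):
    """Generic MFA sketch (hypothetical helper, not the paper's code).

    X: (n_samples, n_features) data, y: (n_samples,) class labels.
    Returns an (n_features, n_components) matrix with unit-norm columns.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]

    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    same = y[:, None] == y[None, :]

    W_intra = np.zeros((n, n))  # intrinsic graph: nearest same-class neighbours
    W_inter = np.zeros((n, n))  # penalty graph: nearest different-class neighbours
    for i in range(n):
        d_same = np.where(same[i], dist[i], np.inf)
        d_same[i] = np.inf  # a sample is not its own neighbour
        d_diff = np.where(~same[i], dist[i], np.inf)
        for j in np.argsort(d_same)[:k_intra]:
            if np.isfinite(d_same[j]):
                W_intra[i, j] = W_intra[j, i] = 1.0
        for j in np.argsort(d_diff)[:k_inter]:
            if np.isfinite(d_diff[j]):
                W_inter[i, j] = W_inter[j, i] = 1.0

    def scatter(W):
        laplacian = np.diag(W.sum(axis=1)) - W
        return X.T @ laplacian @ X

    # Intra-class compactness (regularised so it is positive definite)
    # and inter-class separability among marginal points.
    S_c = scatter(W_intra) + reg * np.eye(X.shape[1])
    S_p = scatter(W_inter)

    # Solve S_p w = lambda * S_c w by whitening with the Cholesky factor of S_c.
    C_inv = np.linalg.inv(np.linalg.cholesky(S_c))
    M = C_inv @ S_p @ C_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)
    top = np.argsort(vals)[::-1][:n_components]
    W = C_inv.T @ vecs[:, top]
    return W / np.linalg.norm(W, axis=0, keepdims=True)
```

If the rows of `X` were vectorised k×k image patches, each returned column could be reshaped into a k×k convolution filter, in the same way PCANet reshapes principal components; whether DMNet follows exactly this convention is an assumption here.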

Supplementary Material

MP4 File (MM21-fp2096.mp4)
With its extraordinary feature representation ability, deep metric learning has been widely used. On the other hand, traditional metric learning methods, which are optimized with non-iterative methods, have proven efficient to optimize. Some studies bring conventional feature extraction methods into the deep learning framework, such as the principal component analysis network (PCANet). PCANet is simple and efficient, yet achieves competitive performance on some image classification tasks. This paper extends this line of research and proposes a deep Marginal Fisher Analysis based CNN. By using a graph embedding framework for convolution filter optimization, it addresses the limitation of PCANet-like CNNs, which are formulated under a Gaussian distribution assumption. Moreover, a binary stochastic hashing scheme is developed for feature selection based on the importance of the feature maps. Experiments show that the proposed method achieves state-of-the-art results among non-iteratively optimized CNN methods.
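The binary stochastic hashing idea described above can be sketched roughly as follows. This is only one plausible reading of the description: the function name `stochastic_binary_hash`, the use of a non-negative `importance` score per feature map (for example an eigenvalue), sampling without replacement, and thresholding at zero are all assumptions made for illustration, not details confirmed by the source.

```python
import numpy as np

def stochastic_binary_hash(feature_maps, importance, n_bits, rng):
    """Hypothetical sketch of importance-weighted binary hashing.

    feature_maps: (n_maps, H, W) real-valued maps.
    importance:   (n_maps,) non-negative scores, one per map (assumed).
    n_bits:       number of maps to sample, one bit per sampled map.
    rng:          a numpy.random.Generator.
    Returns (codes, chosen): an (H, W) integer code map and the sampled indices.
    """
    maps = np.asarray(feature_maps, dtype=float)
    p = np.asarray(importance, dtype=float)
    p = p / p.sum()  # turn importance scores into selection probabilities

    # Sample distinct feature maps; more important maps are more likely picked.
    chosen = rng.choice(maps.shape[0], size=n_bits, replace=False, p=p)

    # Binarise the selected maps (Heaviside step at zero).
    bits = (maps[chosen] > 0).astype(np.int64)

    # Pack the bits at each pixel into one integer, first sampled map = MSB.
    weights = 2 ** np.arange(n_bits - 1, -1, -1, dtype=np.int64)
    codes = np.tensordot(weights, bits, axes=1)
    return codes, chosen


# Usage: hash 8 random 4x4 maps into 3-bit codes.
rng = np.random.default_rng(0)
demo_maps = rng.standard_normal((8, 4, 4))
demo_importance = np.linspace(8.0, 1.0, 8)
demo_codes, demo_chosen = stochastic_binary_hash(demo_maps, demo_importance, 3, rng)
```

In a PCANet-style pipeline, such integer code maps would typically be summarised by block-wise histograms before classification; whether DMNet does the same is not stated in the text above.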


Cited By

  • (2023) Task-Aware Dual-Representation Network for Few-Shot Action Recognition. IEEE Transactions on Circuits and Systems for Video Technology 33:10 (5932-5946). DOI: 10.1109/TCSVT.2023.3262670. Online publication date: 28-Mar-2023
  • (2022) Marginal Fisher Analysis With Polynomial Matrix Function. IEEE Access 10 (102451-102461). DOI: 10.1109/ACCESS.2022.3208901. Online publication date: 2022


Information & Contributors

Information

Published In

MM '21: Proceedings of the 29th ACM International Conference on Multimedia
October 2021
5796 pages
ISBN:9781450386517
DOI:10.1145/3474085
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. binary stochastic hashing
  2. convolution neural network
  3. feature extraction
  4. marginal fisher analysis

Qualifiers

  • Research-article

Conference

MM '21
Sponsor:
MM '21: ACM Multimedia Conference
October 20 - 24, 2021
Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 15
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 03 Jan 2025
