Continuous ODE-defined Image Features for Adaptive Retrieval

Published: 08 June 2020

Abstract

In recent years, content-based image retrieval has benefited greatly from representations extracted by deeper and more complex convolutional neural networks, which have become more effective but also more computationally demanding. Despite hardware acceleration, query processing times can easily be saturated by deep feature extraction in high-throughput or real-time embedded scenarios, so a trade-off between efficiency and effectiveness usually has to be accepted. In this work, we experiment with the recently proposed continuous neural networks defined by parametric ordinary differential equations, dubbed ODE-Nets, for adaptive extraction of image representations. Given the continuous evolution of the network's hidden state, we propose to approximate exact feature extraction by taking an earlier, "near-in-time" hidden state as the features, at reduced computational cost. To understand the potential and the limits of this approach, we also evaluate an ODE-only architecture in which we minimize the number of classical layers, delegating most of the representation learning process --- and thus the feature extraction process --- to the continuous part of the model. Preliminary experiments on standard benchmarks show that we can dynamically control the trade-off between efficiency and effectiveness of feature extraction at inference time by controlling the evolution of the continuous hidden state. Although ODE-only networks provide the finest-grained control over the effectiveness-efficiency trade-off, we observed that mixed architectures perform better than or comparably to standard residual networks in both the image classification and retrieval setups, while using fewer parameters and retaining the controllability of the trade-off.
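The core idea, treating depth as a continuous variable and truncating the integration early to trade accuracy for speed, can be illustrated in a few lines. The sketch below is a toy, pure-Python illustration, not the paper's implementation: the `dynamics` function, the random weights, and the fixed-step Euler solver (ODE-Nets typically use adaptive solvers) are all assumptions made for the sake of the example.

```python
import math
import random

def dynamics(h, W):
    # Toy dynamics f(h): one dense layer with tanh, a stand-in for the
    # convolutional ODE block that defines the continuous hidden state.
    return [math.tanh(sum(w * v for w, v in zip(row, h))) for row in W]

def extract_features(x, W, t_end=1.0, dt=0.05):
    """Fixed-step Euler integration of dh/dt = f(h) from t=0 to t_end.
    Stopping at t_end < 1.0 yields a cheaper, approximate "near-in-time"
    feature vector; the number of steps is the computational budget."""
    n_steps = int(round(t_end / dt))
    h = list(x)
    for _ in range(n_steps):
        f = dynamics(h, W)
        h = [hi + dt * fi for hi, fi in zip(h, f)]
    return h, n_steps

random.seed(0)
dim = 8
W = [[random.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
x = [random.gauss(0.0, 1.0) for _ in range(dim)]  # stand-in input embedding

exact, n_exact = extract_features(x, W, t_end=1.0)    # full trajectory: 20 steps
approx, n_approx = extract_features(x, W, t_end=0.5)  # half the cost: 10 steps
```

Here `extract_features(x, W, t_end=0.5)` visits half as many solver steps as the full integration, mirroring how an earlier hidden state of the continuous trajectory can serve as a cheaper approximate feature vector at inference time.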


Cited By

  • (2024) Visual Attention and ODE-inspired Fusion Network for image dehazing. Engineering Applications of Artificial Intelligence, 130 (Apr 2024), 107692. https://doi.org/10.1016/j.engappai.2023.107692
  • (2022) FedNKD: A Dependable Federated Learning Using Fine-tuned Random Noise and Knowledge Distillation. In Proceedings of the 2022 International Conference on Multimedia Retrieval (27 Jun 2022), 185-193. https://doi.org/10.1145/3512527.3531372


Published In
ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
June 2020
605 pages
ISBN: 9781450370875
DOI: 10.1145/3372278

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. adaptive computation
  2. continuous neural networks
  3. feature extraction
  4. image retrieval
  5. ordinary differential equations

Qualifiers

  • Research-article

Conference

ICMR '20
Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
