Abstract
Most available image data are stored in a compressed format, of which JPEG is the most widespread. To feed this data to a convolutional neural network (CNN), a preliminary decoding step is required to obtain RGB pixels, which demands a high computational load and memory usage. For this reason, the design of CNNs that process JPEG compressed data directly has gained attention in recent years. In most existing works, typical CNN architectures are adapted to learn from DCT coefficients rather than RGB pixels. Although effective, these architectural changes either raise the computational costs or neglect relevant information from the DCT inputs. In this paper, we examine different ways of speeding up CNNs designed for DCT inputs, exploiting learning strategies that reduce the computational complexity while taking full advantage of the DCT inputs. Our experiments were conducted on the ImageNet dataset. Results show that learning how to combine all DCT inputs in a data-driven fashion is better than discarding them by hand, and that combining this strategy with a reduction in the number of layers is effective at reducing computational costs while retaining accuracy.
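As a rough illustration of the data-driven combination idea described in the abstract, the following is a minimal PyTorch sketch: instead of hand-picking a subset of DCT coefficients, all 64 luma and 2 x 64 chroma DCT channels (assuming the common 28x28x64 Y and 14x14x64 Cb/Cr layout used in the "straight from JPEG" literature) are concatenated and mixed by a learned 1x1 convolution before being fed to a shortened backbone. The module name, channel counts, and layer choices are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class LearnedDCTCombiner(nn.Module):
    """Hypothetical sketch: upsample chroma DCT blocks, concatenate all
    192 DCT channels, and learn a 1x1 convolution that mixes them into a
    compact representation for a reduced CNN backbone."""
    def __init__(self, out_channels=192):
        super().__init__()
        # Bring Cb/Cr from 14x14 up to the 28x28 resolution of Y.
        self.upsample = nn.Upsample(scale_factor=2, mode='nearest')
        # Data-driven combination of all 3 * 64 = 192 DCT channels.
        self.combine = nn.Sequential(
            nn.Conv2d(192, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, y_dct, cb_dct, cr_dct):
        # y_dct:  (N, 64, 28, 28) -- luma DCT coefficients
        # cb_dct: (N, 64, 14, 14) -- chroma DCT coefficients
        # cr_dct: (N, 64, 14, 14)
        chroma = torch.cat([self.upsample(cb_dct), self.upsample(cr_dct)], dim=1)
        x = torch.cat([y_dct, chroma], dim=1)  # (N, 192, 28, 28)
        return self.combine(x)                 # passed on to a reduced backbone

if __name__ == "__main__":
    combiner = LearnedDCTCombiner()
    y = torch.randn(2, 64, 28, 28)
    cb = torch.randn(2, 64, 14, 14)
    cr = torch.randn(2, 64, 14, 14)
    print(combiner(y, cb, cr).shape)  # torch.Size([2, 192, 28, 28])

The learned 1x1 convolution lets training decide how much weight each frequency component receives, which is the "less is more" alternative to manually discarding high-frequency channels.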
Acknowledgment
This research was supported by the FAPESP-Microsoft Research Virtual Institute (grant 2017/25908-6) and the Brazilian National Council for Scientific and Technological Development - CNPq (grant 314868/2020-8).
About this paper
Cite this paper
dos Santos, S.F., Almeida, J. (2021). Less Is More: Accelerating Faster Neural Networks Straight from JPEG. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_23
DOI: https://doi.org/10.1007/978-3-030-93420-0_23
Print ISBN: 978-3-030-93419-4
Online ISBN: 978-3-030-93420-0