
Less Is More: Accelerating Faster Neural Networks Straight from JPEG

  • Conference paper
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications (CIARP 2021)

Abstract

Most available image data is stored in a compressed format, of which JPEG is the most widespread. Feeding this data to a convolutional neural network (CNN) requires a preliminary decoding step to obtain RGB pixels, which demands a high computational load and memory usage. For this reason, the design of CNNs that process JPEG compressed data directly has gained attention in recent years. In most existing works, typical CNN architectures are adapted to facilitate learning from DCT coefficients rather than RGB pixels. Although effective, their architectural changes either raise the computational cost or neglect relevant information from the DCT inputs. In this paper, we examine different ways of speeding up CNNs designed for DCT inputs, exploiting learning strategies that reduce computational complexity while taking full advantage of the DCT inputs. Our experiments were conducted on the ImageNet dataset. Results show that learning how to combine all DCT inputs in a data-driven fashion is better than discarding some of them by hand, and that combining this strategy with a reduction in the number of layers is effective for reducing computational costs while retaining accuracy.
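To make the contrast in the abstract concrete, the sketch below illustrates what "DCT inputs" can look like and the two ways of reducing them. It is not the authors' implementation. The function names (`blockwise_dct`, `select_low_frequencies`, `combine_linearly`), the 8×8 block size, and the choice of 16 output channels are assumptions made for illustration, and the projection weights are random where a real network would learn them. JPEG details such as the YCbCr conversion, level shift, and quantization are omitted.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale
    return m

def blockwise_dct(channel, block=8):
    """Map an (H, W) channel to (H/block, W/block, block*block) coefficients."""
    h, w = channel.shape
    assert h % block == 0 and w % block == 0
    d = dct_matrix(block)
    # Split into non-overlapping blocks: (H/b, W/b, b, b).
    blocks = channel.reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T        # 2-D DCT of every block at once
    # Flatten each block's frequencies into the channel axis (row-major).
    return coeffs.reshape(h // block, w // block, block * block)

def select_low_frequencies(dct, k=4, block=8):
    """Hand-crafted reduction: keep only the k x k lowest frequencies."""
    u, v = np.divmod(np.arange(block * block), block)
    return dct[..., (u < k) & (v < k)]

def combine_linearly(dct, weights):
    """Data-driven reduction: a per-position linear map over all channels,
    equivalent to a 1x1 convolution. `weights` would be learned in practice."""
    return dct @ weights

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))       # stand-in for one image channel
dct = blockwise_dct(image)                  # shape (4, 4, 64)
by_hand = select_low_frequencies(dct)       # shape (4, 4, 16), 48 channels dropped
weights = rng.standard_normal((64, 16))     # hypothetical, untrained weights
learned = combine_linearly(dct, weights)    # shape (4, 4, 16), all 64 channels used
```

Both reductions produce the same output shape, so the downstream layers cost the same. The difference is that the hand-crafted selection discards 48 of the 64 frequency channels outright, while the linear combination lets every channel contribute.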



Acknowledgment

This research was supported by the FAPESP-Microsoft Research Virtual Institute (grant 2017/25908-6) and the Brazilian National Council for Scientific and Technological Development - CNPq (grant 314868/2020-8).

Author information

Corresponding author

Correspondence to Samuel Felipe dos Santos.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

dos Santos, S.F., Almeida, J. (2021). Less Is More: Accelerating Faster Neural Networks Straight from JPEG. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_23

  • DOI: https://doi.org/10.1007/978-3-030-93420-0_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93419-4

  • Online ISBN: 978-3-030-93420-0

  • eBook Packages: Computer Science (R0)
