
Scalable Compression of Deep Neural Networks

Published: 01 October 2016
DOI: 10.1145/2964284.2967273

Abstract

Deep neural networks generally involve some layers with millions of parameters, making them difficult to deploy and update on devices with limited resources, such as mobile phones and other smart embedded systems. In this paper, we propose a scalable representation of the network parameters, so that different applications can select the most suitable bit rate for the network based on their own storage constraints. Moreover, when a device needs to upgrade to a high-rate network, the existing low-rate network can be reused, and only incremental data need to be downloaded. We first hierarchically quantize the weights of a pre-trained deep neural network to enforce weight sharing. Next, we adaptively select the bits assigned to each layer given the total bit budget. After that, we retrain the network to fine-tune the quantized centroids. Experimental results show that our method achieves scalable compression with graceful degradation in performance.
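The abstract describes a three-step pipeline: hierarchical quantization of a pre-trained network's weights, adaptive per-layer bit allocation under a total budget, and retraining of the quantized centroids. The snippet below is a minimal sketch of the hierarchical-quantization idea only, applied to a single weight array: a coarse base codebook yields the low-rate network, and a per-cluster refinement codebook is the incremental data a device would download to upgrade. The use of k-means, and all names such as quantize_hierarchical, base_bits, and refine_bits, are illustrative assumptions rather than the authors' implementation; the bit-allocation and retraining steps are not shown.

```python
# Illustrative sketch only, not the authors' code: two-level (hierarchical)
# weight quantization with k-means, so a low-rate "base layer" can later be
# refined by downloading only incremental "enhancement" data.
import numpy as np
from sklearn.cluster import KMeans

def quantize_hierarchical(weights, base_bits=2, refine_bits=2, seed=0):
    """Quantize a 1-D weight array into a shared base codebook plus
    per-cluster residual codebooks (hypothetical names and defaults)."""
    w = weights.reshape(-1, 1)
    n_base, n_ref = 2 ** base_bits, 2 ** refine_bits
    # Base layer: all weights share 2**base_bits centroids (weight sharing).
    base = KMeans(n_clusters=n_base, n_init=10, random_state=seed).fit(w)
    base_idx = base.labels_
    base_centroids = base.cluster_centers_.ravel()
    # Enhancement layer: quantize the residual within each base cluster.
    refine_centroids = np.zeros((n_base, n_ref))
    refine_idx = np.zeros_like(base_idx)
    for c in range(n_base):
        mask = base_idx == c
        if not mask.any():
            continue
        resid = w[mask] - base_centroids[c]
        k = min(n_ref, int(mask.sum()))
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(resid)
        refine_centroids[c, :k] = km.cluster_centers_.ravel()
        refine_idx[mask] = km.labels_
    return base_centroids, base_idx, refine_centroids, refine_idx

def reconstruct(base_centroids, base_idx, refine_centroids=None, refine_idx=None):
    """The base layer alone gives the low-rate network; adding the
    enhancement layer reuses it and upgrades to the high-rate network."""
    w_hat = base_centroids[base_idx]
    if refine_centroids is not None:
        w_hat = w_hat + refine_centroids[base_idx, refine_idx]
    return w_hat

# Demo on random "weights": reconstruction error at the base rate, then
# after applying the incremental enhancement data.
w = np.random.default_rng(0).standard_normal(10_000)
bc, bi, rc, ri = quantize_hierarchical(w)
print("MSE, base layer only :", np.mean((w - reconstruct(bc, bi)) ** 2))
print("MSE, with enhancement:", np.mean((w - reconstruct(bc, bi, rc, ri)) ** 2))
```

Because the refinement codebooks live inside each base cluster, the base-layer indices and centroids never change when the rate increases, which is what allows an upgrade to ship only the enhancement data, analogous to the successive-refinement structure of scalable video and image coding.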


Cited By

  • Discrete cosine transform for filter pruning. Applied Intelligence 53(3):3398-3414. Online publication date: 30-May-2022. DOI: 10.1007/s10489-022-03604-2



Published In

MM '16: Proceedings of the 24th ACM international conference on Multimedia
October 2016
1542 pages
ISBN:9781450336031
DOI:10.1145/2964284
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep neural network
  2. scalable compression

Qualifiers

  • Short-paper

Conference

MM '16: ACM Multimedia Conference
October 15-19, 2016
Amsterdam, The Netherlands

Acceptance Rates

MM '16 paper acceptance rate: 52 of 237 submissions (22%)
Overall acceptance rate: 2,145 of 8,556 submissions (25%)

