Open access

Deep Learning at Scale and at Ease

Published: 02 November 2016

Abstract

Recently, deep learning techniques have enjoyed success in various multimedia applications, such as image classification and multimodal data analysis. Large deep learning models are developed for learning rich representations of complex data. Two challenges must be overcome before deep learning can be widely adopted in multimedia and other applications. One is usability: nonexperts must be able to implement different models and training algorithms without much effort, especially when the model is large and complex. The other is scalability: the deep learning system must be able to provision the huge amount of computing resources needed to train large models on massive datasets. To address these two challenges, in this article we design a distributed deep learning platform called SINGA, which has an intuitive programming model based on the layer abstraction common to deep learning models. Good scalability is achieved through a flexible distributed training architecture and specific optimization techniques. SINGA runs on both GPUs and CPUs, and we show that it outperforms many other state-of-the-art deep learning systems. Our experience developing and training deep learning models for real-life multimedia applications in SINGA shows that the platform is both usable and scalable.
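The "layer abstraction" that the abstract describes can be illustrated with a minimal sketch: a model is an ordered stack of layers, and each layer only needs to implement a forward transformation. The `Layer`, `Scale`, `ReLU`, and `Net` classes below are hypothetical names invented for illustration, not SINGA's actual API.

```python
class Layer:
    """Base of the layer abstraction: every layer transforms its input."""
    def forward(self, x):
        raise NotImplementedError

class Scale(Layer):
    """Toy parameterized layer: multiplies every element by a constant."""
    def __init__(self, factor):
        self.factor = factor

    def forward(self, x):
        return [v * self.factor for v in x]

class ReLU(Layer):
    """Toy activation layer: clamps negative values to zero."""
    def forward(self, x):
        return [max(v, 0.0) for v in x]

class Net:
    """A model is an ordered list of layers; forward() chains them."""
    def __init__(self, layers):
        self.layers = layers

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

net = Net([Scale(2.0), ReLU()])
print(net.forward([-1.0, 0.5, 3.0]))  # [0.0, 1.0, 6.0]
```

Because users compose models from such layers rather than writing training loops from scratch, a platform built on this abstraction can take over the distribution of computation across workers, which is the usability/scalability combination the abstract argues for.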




Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 12, Issue 4s: Special Section on Trust Management for Multimedia Big Data and Special Section on Best Papers of ACM Multimedia 2015
November 2016, 242 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/2997658
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 November 2016
Accepted: 01 August 2016
Revised: 01 June 2016
Received: 01 February 2016
Published in TOMM Volume 12, Issue 4s


Author Tags

  1. multimedia
  2. deep learning
  3. distributed training

Qualifiers

  • Announcement
  • Research
  • Refereed

Funding Sources

  • A*STAR
  • National Natural Science Foundation of China
  • National Research Foundation, Prime Minister's Office, Singapore under its Competitive Research Programme
  • National Research Foundation, Energy Innovation Programme Office, Singapore under Energy Innovation Research Programme
