
DimmWitted: a study of main-memory statistical analytics

Published: 01 August 2014

Abstract

We perform the first study of the tradeoff space of access methods and replication to support statistical analytics using first-order methods executed in the main memory of a Non-Uniform Memory Access (NUMA) machine. Statistical analytics systems differ from conventional SQL-analytics in the amount and types of memory incoherence that they can tolerate. Our goal is to understand tradeoffs in accessing the data in row- or column-order and at what granularity one should share the model and data for a statistical task. We study this new tradeoff space and discover that there are tradeoffs between hardware and statistical efficiency. We argue that our tradeoff study may provide valuable information for designers of analytics engines: for each system we consider, our prototype engine can run at least one popular task at least 100× faster. We conduct our study across five architectures using popular models, including SVMs, logistic regression, Gibbs sampling, and neural networks.
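The access-method tradeoff named in the abstract can be made concrete with a small example. The sketch below contrasts row-order access (each step reads one tuple and updates the whole model, as in stochastic gradient descent) with column-order access (each step reads one feature column and updates a single model coordinate, as in stochastic coordinate descent) for logistic regression. This is a minimal NumPy illustration of the two access patterns, not the DimmWitted engine; the function names, hyperparameters, and synthetic data are hypothetical.

import numpy as np

def row_access_sgd(X, y, epochs=5, lr=0.1, seed=0):
    # Row order: each step reads one example (a row of X) and updates the
    # full model vector, as in stochastic gradient descent.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * X[i].dot(w)
            w -= lr * (-y[i] * X[i] / (1.0 + np.exp(margin)))
    return w

def column_access_scd(X, y, epochs=5, lr=0.1, seed=0):
    # Column order: each step reads one feature (a column of X) and updates
    # a single model coordinate, as in stochastic coordinate descent.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    margin = y * (X @ w)  # per-example margins, kept in sync incrementally
    for _ in range(epochs):
        for j in rng.permutation(d):
            g_j = np.mean(-y * X[:, j] / (1.0 + np.exp(margin)))
            step = -lr * g_j
            w[j] += step
            margin += y * X[:, j] * step
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((2000, 50))
    y = np.sign(X @ rng.standard_normal(50))
    acc = lambda w: np.mean(np.sign(X @ w) == y)
    print("row-access SGD accuracy:   ", acc(row_access_sgd(X, y)))
    print("column-access SCD accuracy:", acc(column_access_scd(X, y)))

As the abstract notes, which variant is preferable depends on both hardware efficiency (how the data is traversed in memory) and statistical efficiency (how much progress each step makes); the sketch shows only the access patterns, not the NUMA-aware model and data replication the paper also studies.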





Published In

Proceedings of the VLDB Endowment, Volume 7, Issue 12
August 2014
296 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 August 2014
Published in PVLDB Volume 7, Issue 12

Qualifiers

  • Research-article


