[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2025113.2025133acmconferencesArticle/Chapter ViewAbstractPublication PagesfseConference Proceedingsconference-collections
research-article

Managing performance vs. accuracy trade-offs with loop perforation

Published: 09 September 2011 Publication History

Abstract

Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad-hoc, domain-specific techniques developed specifically for the computation at hand. Loop perforation provides a general technique to trade accuracy for performance by transforming loops to execute a subset of their iterations. A criticality testing phase filters out critical loops (whose perforation produces unacceptable behavior) to identify tunable loops (whose perforation produces more efficient and still acceptably accurate computations). A perforation space exploration algorithm perforates combinations of tunable loops to find Pareto-optimal perforation policies. Our results indicate that, for a range of applications, this approach typically delivers performance increases of over a factor of two (and up to a factor of seven) while changing the result that the application produces by less than 10%.

References

[1]
[2]
D. N. A. Asuncion. UCI machine learning repository, 2007.
[3]
E. Amigo, J. Gonzalo, and J. Artiles. A comparison of extrinsic clustering evaluation metrics based on formal constraints. In Information Retrieval Journal. Springer Netherlands, July 2008.
[4]
J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao, A. Edelman, and S. Amarasinghe. Petabricks: A language and compiler for algorithmic choice. In PLDI '09, Jun 2009.
[5]
W. Baek and T. Chilimbi. Green: a framework for supporting energy-conscious programming using controlled approximation. In PLDI '10, 2010.
[6]
C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In PACT '08, Oct 2008.
[7]
K. Brandenburg. MP3 and AAC explained. In AES 17th International Conference on High-Quality Audio Coding, 1999.
[8]
S. Chakradhar, A. Raghunathan, and J. Meng. Best-Effort Parallel Execution Framework for Recognition and Mining Applications. In IPDPS, 2009.
[9]
L. Chakrapani, K. Muntimadugu, A. Lingamneni, J. George, and K. Palem. Highly energy and performance efficient embedded computing through approximately correct arithmetic: A mathematical foundation and preliminary experimental validation. In International conference on Compilers, architectures and synthesis for embedded systems, 2008.
[10]
S. Chaudhuri, S. Gulwani, and R. Lublinerman. Continuity analysis of programs. In POPL '10.
[11]
S. Chaudhuri, S. Gulwani, R. Lublinerman, and S. Navidpour. Proving Programs Robust. In FSE '11, 2011.
[12]
J. Deutscher and I. Reid. Articulated body motion capture by stochastic search. International Journal of Computer Vision, 2005.
[13]
J. Flinn and M. Satyanarayanan. Energy-aware adaptation for mobile applications. In SOSP '99.
[14]
M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In Proc. 1998 IEEE Intl. Conf. Acoustics Speech and Signal Processing, pages 1381--1384.
[15]
J. Hartigan and M. Wong. A k-means clustering algorithm. Journal of the Royal Statistical Society C, 28:100--108, 1979.
[16]
H. Hoffmann, S. Misailovic, S. Sidiroglou, A. Agarwal, and M. Rinard. Using Code Perforation to Improve Performance, Reduce Energy Consumption, and Respond to Failures . Technical Report MIT-CSAIL-TR-2009-042, MIT, Sept. 2009.
[17]
H. Hoffmann, S. Sidiroglou, M. Carbin, S. Misailovic, A. Agarwal, and M. Rinard. Dynamic Knobs for Responsive Power-Aware Computing. In ASPLOS '11, March 2011.
[18]
M. Kalos and P. Whitlock. Monte Carlo Methods. Wiley-VCH, 2008.
[19]
P. Keleher, J. Hollingsworth, and D. Perkovic. Exposing application alternatives. In ICDCS '99.
[20]
C. Lattner and V. Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In CGO '04.
[21]
J. Meng, A. A. Raghunathan, and S. B. Chakradhar. Exploiting the Forgiving Nature of Applications for Scalable Parallel Execution. In IPDPS, 2010.
[22]
S. Misailovic, D. Kim, and M. Rinard. Automatic Parallelization with Statistical Accuracy Bounds. Technical Report MIT-CSAIL-TR-2010-007, MIT, 2010.
[23]
S. Misailovic, D. Kim, and M. Rinard. Parallelizing Sequential Programs With Statistical Accuracy Test. Technical Report MIT-CSAIL-TR-2010-038, 2010.
[24]
S. Misailovic, D. Roy, and M. Rinard. Probabilistic and Statistical Analysis of Perforated Patterns. Technical Report MIT-CSAIL-TR-2011-003, MIT, 2011.
[25]
S. Misailovic, D. Roy, and M. Rinard. Probabilistically Accurate Program Transformations. In SAS '11, 2011.
[26]
S. Misailovic, S. Sidiroglou, H. Hoffmann, and M. Rinard. Quality of Service Profiling. In ICSE, 2010.
[27]
N. Nethercote and J. Seward. Valgrind A Program Supervision Framework. Electronic Notes in Theoretical Computer Science, 2003.
[28]
P. Pillai and K. Shin. Real-time dynamic voltage scaling for low-power embedded operating systems. In SOSP '01, page 102, 2001.
[29]
R. Ribler, J. Vetter, H. Simitci, and D. Reed. Autopilot: adaptive control of distributed applications. In High Performance Distributed Computing, July 1998.
[30]
M. Rinard. Using early phase termination to eliminate load imbalances at barrier synchronization points. In OOPSLA '07.
[31]
M. Rinard. Probabilistic accuracy bounds for fault-tolerant computations that discard tasks. In Proceedings of the 20th annual international conference on Supercomputing, 2006.
[32]
M. Rinard, H. Hoffmann, S. Misailovic, and S. Sidiroglou. Patterns and statistical analysis for understanding reduced resource computing. In Onward! '10.
[33]
J. Sorber, A. Kostadinov, M. Garber, M. Brennan, M. D. Corner, and E. D. Berger. Eon: a language and runtime system for perpetual systems. In SenSys '07.
[34]
P. Stanley-Marbell and D. Marculescu. Deviation-Tolerant Computation in Concurrent Failure-Prone Hardware. Technical Report ESR-2008-01, Eindhoven University of Technology, January 2008.
[35]
C. Tapus, I. Chung, and J. Hollingsworth. Active harmony: Towards automated performance tuning. In Supercomputing, ACM/IEEE 2002 Conference, 2002.
[36]
G. Wallace. The JPEG still picture compression standard. IEEE Transactions on Consumer Electronics, 1991.
[37]
R. Whaley and J. Dongarra. Automatically tuned linear algebra software. In ACM/IEEE conference on Supercomputing (CDROM).
[38]
x264. http://www.videolan.org/x264.html.
[39]
J. Xiong, J. Johnson, R. W. Johnson, and D. Padua. SPL: A language and compiler for DSP algorithms. In PLDI '01.

Cited By

View all
  • (2024)Evolving to Find Optimizations Humans Miss: Using Evolutionary Computation to Improve GPU Code for Bioinformatics ApplicationsACM Transactions on Evolutionary Learning and Optimization10.1145/37039204:4(1-29)Online publication date: 15-Nov-2024
  • (2024)Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN PruningProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3620666.3651351(416-430)Online publication date: 27-Apr-2024
  • (2024)ARCTIC: Approximate Real-Time Computing in a Cache-Conscious Multicore EnvironmentIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.338444243:10(2944-2957)Online publication date: Oct-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ESEC/FSE '11: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
September 2011
548 pages
ISBN:9781450304436
DOI:10.1145/2025113
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 September 2011

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. loop perforation
  2. profiling
  3. quality of service

Qualifiers

  • Research-article

Conference

ESEC/FSE'11
Sponsor:

Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)99
  • Downloads (Last 6 weeks)31
Reflects downloads up to 11 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Evolving to Find Optimizations Humans Miss: Using Evolutionary Computation to Improve GPU Code for Bioinformatics ApplicationsACM Transactions on Evolutionary Learning and Optimization10.1145/37039204:4(1-29)Online publication date: 15-Nov-2024
  • (2024)Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN PruningProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3620666.3651351(416-430)Online publication date: 27-Apr-2024
  • (2024)ARCTIC: Approximate Real-Time Computing in a Cache-Conscious Multicore EnvironmentIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.338444243:10(2944-2957)Online publication date: Oct-2024
  • (2024)Interleaved Execution of Approximated CUDA Kernels in Iterative Applications2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)10.1109/PDP62718.2024.00017(60-67)Online publication date: 20-Mar-2024
  • (2024)Exact and Approximate Tasks Computation in IoT NetworksIEEE Internet of Things Journal10.1109/JIOT.2023.331669911:5(7974-7988)Online publication date: 1-Mar-2024
  • (2024)Lightweight Cryptanalysis of IoT Encryption Algorithms: Is Quota Sampling the Answer?IEEE Access10.1109/ACCESS.2024.347201412(147619-147639)Online publication date: 2024
  • (2024)A Quality-Aware Voltage Overscaling Framework to Improve the Energy Efficiency and Lifetime of TPUs Based on Statistical Error ModelingIEEE Access10.1109/ACCESS.2024.342201212(92181-92197)Online publication date: 2024
  • (2024)Syntactic and Semantic Analysis of Temporal Assertions to Support the Approximation of RTL DesignsJournal of Electronic Testing10.1007/s10836-024-06115-940:2(199-214)Online publication date: 23-Apr-2024
  • (2023)Hardware and Software Support for Mixed Precision Computing: a Roadmap for Embedded and HPC Systems2023 Design, Automation & Test in Europe Conference & Exhibition (DATE)10.23919/DATE56975.2023.10137092(1-6)Online publication date: Apr-2023
  • (2023)Toward a Life Cycle Assessment for the Carbon Footprint of DataProceedings of the 2nd Workshop on Sustainable Computer Systems10.1145/3604930.3605724(1-9)Online publication date: 9-Jul-2023
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media