Software profiling for hot path prediction: less is more

Published: 01 November 2000

Abstract

Recently, there has been growing interest in exploiting profile information in adaptive systems such as just-in-time compilers, dynamic optimizers, and binary translators. In this paper, we show that sophisticated software profiling schemes that provide highly accurate information in an offline setting are ill-suited for these dynamic code generation systems. We experimentally demonstrate that hot path predictions must be made early in order to control the rising cost of missed opportunities that results from the prediction delay. We also show that existing sophisticated path profiling schemes, if used in an online setting, offer no prediction advantages over simpler schemes that exhibit much lower runtime overheads.

Based on these observations, we developed a new low-overhead software profiling scheme for hot path prediction. Using an abstract metric, we compare our scheme to path-profile-based prediction and show that our scheme achieves comparable prediction quality. In our second set of experiments, we include runtime overhead and evaluate the performance of our scheme in a realistic application: Dynamo, a dynamic optimization system. The results show that our prediction scheme clearly outperforms path-profile-based prediction and thus confirm that less profiling, as exhibited in our scheme, actually leads to more effective hot path prediction.
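
The abstract does not spell out how the low-overhead scheme works. As a rough illustration only, the sketch below assumes one plausible design: keep a single counter per path start point (rather than one per path) and, once a counter reaches a threshold, speculatively predict the path executing at that moment as hot. The class name `HeadCounterPredictor`, the `observe` method, the tuple-of-block-ids path encoding, and the threshold values are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class HeadCounterPredictor:
    """Hypothetical hot path predictor that profiles path heads only."""

    def __init__(self, threshold=50):
        self.threshold = threshold      # assumed prediction delay (illustrative value)
        self.counts = defaultdict(int)  # one counter per path head, not per path
        self.hot_paths = {}             # head -> path predicted as hot

    def observe(self, path):
        """Record one executed path, given as a tuple of block ids.

        Returns the path if this execution triggers a prediction, else None.
        """
        head = path[0]
        if head in self.hot_paths:
            return None                 # already predicted: no further profiling
        self.counts[head] += 1
        if self.counts[head] >= self.threshold:
            # Speculate that the path executing right now is the hot one.
            self.hot_paths[head] = path
            return path
        return None


# Usage: with a threshold of 3, the third execution starting at "A" is predicted.
predictor = HeadCounterPredictor(threshold=3)
trace = [("A", "B", "D"), ("A", "C", "D"), ("A", "B", "D"), ("A", "B", "D")]
predictions = [predictor.observe(p) for p in trace]
print(predictions)  # [None, None, ('A', 'B', 'D'), None]
```

In this sketch the profiling cost is one counter increment per path start, and profiling stops for a head as soon as a prediction is made. A lower threshold shortens the prediction delay that the abstract identifies as a source of missed opportunity, at the price of a less informed guess.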

Cited By

  • (2023) MESA: Microarchitecture Extensions for Spatial Architecture Generation. Proceedings of the 50th Annual International Symposium on Computer Architecture, pp. 1-14. DOI: 10.1145/3579371.3589084. Online publication date: 17-Jun-2023
  • (2023) Leveraging Hardware Performance Counters for Efficient Classification of Binary Packers. 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 1859-1864. DOI: 10.1109/TrustCom60117.2023.00252. Online publication date: 1-Nov-2023
  • (2020) Exploring Impact of Profile Data on Code Quality in the HotSpot JVM. ACM Transactions on Embedded Computing Systems 19:6, pp. 1-26. DOI: 10.1145/3391894. Online publication date: 3-Oct-2020

Information & Contributors

Information

Published In

ACM SIGPLAN Notices, Volume 35, Issue 11
Nov. 2000
269 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/356989

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 November 2000
Published in SIGPLAN Volume 35, Issue 11

Qualifiers

  • Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 76
  • Downloads (last 6 weeks): 17
Download counts reflect downloads up to 10 Dec 2024.

Citations

  • (2019) Exploiting Vector Processing in Dynamic Binary Translation. Proceedings of the 48th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3337821.3337844. Online publication date: 5-Aug-2019
  • (2019) Characterizing Dominant Program Behavior Using the Execution-Time Variance of the Call Structure. 2019 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), pp. 117-129. DOI: 10.1109/RTAS.2019.00018. Online publication date: Apr-2019
  • (2018) Optimising Dynamic Binary Modification Across ARM Microarchitectures. Proceedings of the 2018 ACM/SPEC International Conference on Performance Engineering, pp. 28-39. DOI: 10.1145/3184407.3184425. Online publication date: 30-Mar-2018
  • (2018) Speedoo. Proceedings of the 40th International Conference on Software Engineering, pp. 811-821. DOI: 10.1145/3180155.3180229. Online publication date: 27-May-2018
  • (2018) Evaluation of Timing Side-Channel Leakage on a Multiple-Target Dynamic Binary Translator. 2018 Symposium on High Performance Computing Systems (WSCAD), pp. 198-204. DOI: 10.1109/WSCAD.2018.00039. Online publication date: Oct-2018
  • (2018) Fog-Assisted Translation: Towards Efficient Software Emulation on Heterogeneous IoT Devices. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1268-1277. DOI: 10.1109/IPDPSW.2018.00196. Online publication date: May-2018
  • (2017) HyperMAMBO-X64. ACM SIGPLAN Notices 52:7, pp. 228-241. DOI: 10.1145/3140607.3050756. Online publication date: 8-Apr-2017
