
Online optimizations driven by hardware performance monitoring

Published: 10 June 2007

Abstract

Hardware performance monitors provide detailed direct feedback about application behavior and are an additional source of information that a compiler may use for optimization. A JIT compiler is in a good position to make use of such information because it is running on the same platform as the user applications. As hardware platforms become more and more complex, it becomes more and more difficult to model their behavior. Profile information that captures general program properties (like execution frequency of methods or basic blocks) may be useful, but does not capture sufficient information about the execution platform. Machine-level performance data obtained from a hardware performance monitor can not only direct the compiler to those parts of the program that deserve its attention but also determine if an optimization step actually improved the performance of the application.
This paper presents an infrastructure based on a dynamic compiler+runtime environment for Java that incorporates machine-level information as an additional kind of feedback for the compiler and runtime environment. The low-overhead monitoring system provides fine-grained performance data that can be tracked back to individual Java bytecode instructions. As an example, the paper presents results for object co-allocation in a generational garbage collector that optimizes spatial locality of objects on-line using measurements about cache misses. In the best case, the execution time is reduced by 14% and L1 cache misses by 28%.
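The idea of steering object co-allocation with cache-miss measurements can be illustrated with a small Java sketch. It is a toy model under stated assumptions, not the paper's implementation: the class `CoAllocationAdvisor`, the `"Class.field"` string keys, and the fixed threshold are all hypothetical. It assumes that each sampled miss event has already been mapped back to the reference field whose load caused it, and it only decides which fields are candidates for placing the referenced object next to its parent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model: counts sampled cache misses per reference field and picks co-allocation candidates. */
class CoAllocationAdvisor {
    // Sampled miss count per reference field, keyed by a hypothetical "Class.field" name.
    private final Map<String, Integer> missesByField = new HashMap<>();
    // Hypothetical cutoff: fields with at least this many sampled misses count as hot.
    private final int threshold;

    CoAllocationAdvisor(int threshold) {
        this.threshold = threshold;
    }

    /** Called once per sampled miss event that has been attributed to a field load. */
    void recordMiss(String field) {
        missesByField.merge(field, 1, Integer::sum);
    }

    /** True if a collector could try to place the referenced object next to its parent. */
    boolean shouldCoAllocate(String field) {
        return missesByField.getOrDefault(field, 0) >= threshold;
    }

    /** Fields at or above the threshold, most misses first, ties broken by name. */
    List<String> hotFields() {
        List<String> hot = new ArrayList<>();
        for (Map.Entry<String, Integer> e : missesByField.entrySet()) {
            if (e.getValue() >= threshold) {
                hot.add(e.getKey());
            }
        }
        hot.sort((a, b) -> {
            int byCount = Integer.compare(missesByField.get(b), missesByField.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return hot;
    }

    public static void main(String[] args) {
        CoAllocationAdvisor advisor = new CoAllocationAdvisor(3);
        // Made-up sample stream standing in for events delivered by a monitoring layer.
        String[] samples = {"Node.next", "Node.next", "Tree.left", "Node.next", "Node.value"};
        for (String field : samples) {
            advisor.recordMiss(field);
        }
        System.out.println("Co-allocation candidates: " + advisor.hotFields());
    }
}
```

In a real runtime the counters would be fed by the sampling infrastructure and consulted by the copying collector when it moves a parent object. The sketch leaves out both sides and keeps only the decision logic in between.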

Cited By

  • (2021) Modelling Application Cache Behavior using Regression Models. 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), pages 1879-1886. DOI: 10.1109/COMPSAC51774.2021.00284. Online publication date: Jul-2021
  • (2017) SmartGC: Online Memory Management Prediction for PaaS Cloud Models. On the Move to Meaningful Internet Systems. OTM 2017 Conferences, pages 370-388. DOI: 10.1007/978-3-319-69462-7_25. Online publication date: 20-Oct-2017
  • (2016) Efficient Management for Hybrid Memory in Managed Language Runtime. Network and Parallel Computing, pages 29-42. DOI: 10.1007/978-3-319-47099-3_3. Online publication date: 28-Oct-2016

Information & Contributors

Information

Published In

ACM SIGPLAN Notices, Volume 42, Issue 6
Proceedings of the 2007 PLDI conference
June 2007
491 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1273442
  • PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2007
    508 pages
    ISBN:9781595936332
    DOI:10.1145/1250734
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 June 2007
Published in SIGPLAN Volume 42, Issue 6

Author Tags

  1. Java
  2. dynamic optimization
  3. hardware performance monitors
  4. just-in-time compilation

Qualifiers

  • Article

Citations

Cited By

  • (2014) Reconfigurable vertical profiling framework for the android runtime system. ACM Transactions on Embedded Computing Systems 13:2s, pages 1-25. DOI: 10.1145/2544375.2544379. Online publication date: 27-Jan-2014
  • (2014) Toward the efficient use of multiple explicitly managed memory subsystems. 2014 IEEE International Conference on Cluster Computing (CLUSTER), pages 123-131. DOI: 10.1109/CLUSTER.2014.6968756. Online publication date: Sep-2014
  • (2014) Exploiting Hardware Monitoring in Software Engineering. Pages 53-101. DOI: 10.1016/B978-0-12-800162-2.00002-6. Online publication date: 2014
  • (2018) Bayonet: probabilistic inference for networks. ACM SIGPLAN Notices 53:4, pages 586-602. DOI: 10.1145/3296979.3192400. Online publication date: 11-Jun-2018
  • (2014) Efficient code management for dynamic multi-tiered compilation systems. Proceedings of the 2014 International Conference on Principles and Practices of Programming on the Java platform: Virtual machines, Languages, and Tools, pages 51-62. DOI: 10.1145/2647508.2647513. Online publication date: 23-Sep-2014
