Abstract
Software optimization techniques are highly reliant on program behavior to deliver high performance. A key element with these techniques is to identify program paths that are likely to achieve the greatest performance benefits at runtime. Several approaches have been proposed to address this problem. However, many of them fail to cover larger optimization scope as they are restricted to loops or procedures. This paper introduces a novel approach for representing and analyzing complete program paths. Unlike the whole-program paths (WPPs) approach that relies on a DAG to represent program paths, our program trace is processed into a suffix-array that can enable very fast searching algorithms that run with time O(ln (N)), N being the length of the trace. This allows to process reasonable trace sizes offline, avoiding the high runtime overhead incurred by WPPs, while accurately characterizing hot paths. Our evaluation shows impressive performance results, with almost 48% of the code being covered by hot paths. We also demonstrate the effectiveness of our approach to optimize for power. For this purpose, an adaptive cache resizing scheme is used that shows energy savings in the order of 12%.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Bala, V.: Low overhead path profiling. Technical Report HPL-96-87, Hewlett Packard Labs (1996)
Balasubramonian, R., Albonesi, D.H., Buyuktosunoglu, A., Dwarkadas, S.: Memory hierarchy reconfiguration for energy and performance in general purpose processor architectures. In: Proc. of the 33th Int’l Conf. on Microarchitecture, December 2000, pp. 245–257 (2000)
Ball, T., Larus, J.R.: Optimally profiling and tracing programs. ACM Transactions on Programming Languages and Systems 16(4), 1319–1360 (1994)
Ball, T., Larus, J.R.: Efficient path profiling. In: Proc. of the 29th Annual Int’l Symposium on Microarchitecture (December 1996)
Ball, T.: What’s in a Region? or computing control dependence regions in near-linear time for reducible control flow. ACM Letters on Programming Languages and Systems 2(1-4), 1–16 (1993)
Bodin, F., Rohou, E., Seznec, A.: System for Assembly-Language Transformation and Optimization. In: Proc. of the 6th Workshop on Compilers for Parallel Computers (December 1996)
Burger, D., Austin, T.: The SimpleScalar Tool Set, Version 2.0. Computer Architecture News, pp. 13–25 (1997)
Cytron, R., Ferrante, J., Rosen, B.K., Wegman, M.N., Zadeck, K.: Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and Systems 13(4), 451–490 (1991)
Faraboschi, P., Brown, G., Fisher, J.A., Desoli, G., Homewood, F.: Lx: A Technology Platform for Customizable VLIW Embedded Processing. In: Proc. of the 27th Int’l Symposium on Computer Architecture (June 2000)
Grossi, R., Vitter, J.S.: Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching. In: Proc. of the ACM Symposium on the Theory of Computing (2000)
Guthaus, M.R., Ringenberg, J.S., Ernst, D., Austin, T.M., Mudge, T., Brown, R.B.: MiBench: A Free, Commercially Representative Embedded Benchmark Suite. In: Proc. of the 4th IEEE Int’l Workshop on Workload Characterization, December 2001, pp. 3–14 (2001)
Jacobson, Q., Rotenberg, E., Smith, J.: Path-Based Next Trace Prediction. In: Proc. of the 30th Int’l Symposium on Microarchitecture (November 1997)
Karp, R.M., Miller, R.E., Rosenberg, A.L.: Rapid Identification of Repeated Patterns in Strings, Arrays and Trees. In: Proc. of the 4th ACM Symposium on Theory of Computing (1972)
Larus, J.R.: Whole Program Paths. In: Proc. of the ACM SIGPLAN Conf. on Programming Language Design and Implementation (May 1999)
Manber, U., Myers, G.: Suffix Arrays: A New Method for on-line String Searches. In: Proc. of 1st ACM-SIAM SODA, pp. 319–327 (1990)
Nevill-Manning, C.G., Witten, I.H.: Identifying Hierarchical Structure in Sequences. Journal of Artificial Intelligence Research 7, 67–82 (1997)
Plezkun, A.R.: Techniques for compressing program address traces. In: Proc. of the 27th Int’l Conf. on Microarchitecture, pp. 32–40 (1994)
Pokam, G., Bodin, F.: Energy-efficiency potential of a phase-based cache resizing scheme for embedded systems. In: Proc. of the 8th Int’l Worskhop on Interaction between Compilers and Computer Architectures (February 2004)
Shivakumar, P., Jouppi, N.: Cacti 3.0: An integrated cache timing power, and area model. Technical report, DEC Western research Lab (2002)
Young, C., Smith, M.: Better Global Scheduling Using Path Profiles. In: Proc. of the 30th Int’l Symposium on Microarchitecture (December 1998)
Zhang, C., Vahid, F., Najjar, W.: A highly configurable cache architecture for embedded systems. In: Proc. of the 30th Int’l Symposium on Computer Architecture (June 2003)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pokam, G., Bodin, F. (2005). An Offline Approach for Whole-Program Paths Analysis Using Suffix Arrays. In: Eigenmann, R., Li, Z., Midkiff, S.P. (eds) Languages and Compilers for High Performance Computing. LCPC 2004. Lecture Notes in Computer Science, vol 3602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532378_26
Download citation
DOI: https://doi.org/10.1007/11532378_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28009-5
Online ISBN: 978-3-540-31813-2
eBook Packages: Computer ScienceComputer Science (R0)