Abstract
Reducing the traffic between the CPU and main memory is one of the main issues in optimizing programs for load/store architectures. The register allocation module of an optimizing compiler keeps this traffic low by carefully assigning program variables to CPU registers. Since register allocation takes place during code generation and works on the intermediate code produced by the compiler front end, the structure of this code, which closely depends on the structure of the source code, heavily affects the effectiveness of register allocation. Suitable techniques can be used to restructure source programs so that the resulting intermediate code can take advantage of advanced register allocation schemes.
In this paper we analyze one of these techniques, called unroll-and-jam. In particular, we find the fractional optimal unroll-and-jam transformation, valid for a large class of compute-intensive programs. The paper presents the analytical model of the optimal unroll-and-jam and a method to compute the unrolling parameters numerically.
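As a rough illustration of the transformation named above, the following minimal sketch applies unroll-and-jam to a matrix-vector product. The kernel, the function names `matvec` and `matvec_uj2`, and the unrolling factor of 2 are assumptions chosen for the example; they are not taken from the paper, and the factor is not the optimal value that the paper's model computes. The outer loop is unrolled twice and the two resulting copies of the inner loop are fused, so each `x[j]` is loaded once and used for two rows.

```c
#include <stddef.h>

/* Original loop nest: y[i] += a[i][j] * x[j], with a stored row-major
 * as a flat array of n * m elements. Each x[j] is reloaded for every row i. */
void matvec(size_t n, size_t m, const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            y[i] += a[i * m + j] * x[j];
}

/* Unroll-and-jam with an illustrative factor of 2 (assumes n is even).
 * The outer loop advances two rows at a time, and the two inner-loop
 * bodies are jammed into a single inner loop. The scalars y0, y1 and xj
 * are candidates for registers, so x[j] is loaded once per pair of rows. */
void matvec_uj2(size_t n, size_t m, const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i += 2) {
        double y0 = y[i];
        double y1 = y[i + 1];
        for (size_t j = 0; j < m; j++) {
            double xj = x[j];               /* shared by both rows */
            y0 += a[i * m + j] * xj;
            y1 += a[(i + 1) * m + j] * xj;
        }
        y[i] = y0;
        y[i + 1] = y1;
    }
}
```

Both functions accumulate each `y[i]` in the same order over `j`, so they are intended to produce the same result. A larger unrolling factor reuses each loaded value more often but needs more registers for the accumulators, which is the kind of trade-off an analytical model of the unrolling parameters has to balance.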
Work carried out with the financial support of the Ministero dell'Università e della Ricerca Scientifica e Tecnologica (MURST) in the project “Methodologies and Tools of High Performance Systems for Multimedia Applications”.
Copyright information
© 1999 Springer-Verlag
About this paper
Cite this paper
Zingirian, N., Maresca, M. (1999). Finding the optimal unroll-and-jam. In: Sloot, P., Bubak, M., Hoekstra, A., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1999. Lecture Notes in Computer Science, vol 1593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100624
DOI: https://doi.org/10.1007/BFb0100624
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65821-4
Online ISBN: 978-3-540-48933-7
eBook Packages: Springer Book Archive