[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ Skip to main content
Log in

Evolutionary Algorithms for Instruction Scheduling, Operation Merging, and Register Allocation in VLIW Compilers

  • Published:
Journal of Signal Processing Systems Aims and scope Submit manuscript

Abstract

Code generation for VLIW processors includes several optimization problems like code optimization, instruction scheduling, and register allocation. The high complexity of these problems usually does not allow the computation of the optimal solution. Instead, optimization techniques, e.g., based on heuristics, are used to find acceptable solutions in a reasonable time. List scheduling is a well known heuristic-based microcode compaction method, that bases its scheduling decisions on weights derived from dependency analysis of the input program. Additional information and methods have to be used in order to reach better code compaction. Also, more sophisticated code optimization and register allocation support better code compaction. In this paper, evolutionary algorithms are used as dynamic heuristics in code generation, which allows dynamic adaption to the given input program and target processor configuration. Three evolutionary algorithms for operation merging, instruction scheduling, and register allocation are presented and evaluated on an exemplary image processing application, which shows different processing characteristics in the subroutines. They outperform code generation based on static heuristics and allow compilation for restricted target architectures that cannot be handled by the static heuristics.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13
Figure 14
Figure 15
Figure 16
Figure 17
Figure 18
Figure 19
Figure 20
Figure 21
Figure 22
Figure 23
Figure 24

Similar content being viewed by others

Notes

  1. www.luis.uni-hannover.de/scientific_computing.html

  2. https://www.openmp.org

References

  1. Bahuleyan, J, Nagpal, R, Srikant, YN. (2010). Integrated energy-aware cyclic and acyclic scheduling for clustered VLIW processors. In: International symposium on parallel & distributed processing, Workshops and Phd Forum (IPDPSW).

  2. Briggs, P. (1992). Register allocation via graph coloring. PhD thesis: Rice University.

    Google Scholar 

  3. Codina, JM, Sanchez, J, Gonzalez, A. (2001). A unified modulo scheduling and register allocation technique for clustered processors. In: Proceedings of international conference on parallel architectures and compilation techniques.

  4. Eriksson, MV, Skoog, O, Kessler, CW. (2008). Optimal vs. heuristic integrated code generation for clustered VLIW architectures. In Proceedings of 11th international workshop on software & compilers for embedded systems (SCOPES ’08). New York: ACM.

  5. Fisher, JA. (1979). Optimization of horizontal microcode within and beyond basic blocks: an application of processor scheduling with resources. Tech. rep.: NY Univ. Courant Mathem.and Comp.Lab.

    Google Scholar 

  6. Gerlach, L, Marquardt, D, Payá Vayá, G, Liu, S, Weißbrich, M, Doclo, S, Blume, H. (2017). Analyzing the trade-off between power consumption and beamforming algorithm performance using a hearing aid ASIP. In: Proceedings of international conference on embedded computer systems: Architectures, Modeling and simulation.

  7. Giesemann, F, Payá-Vayá, G., Blume, H. (2012). A Hardware/software environment for specializing dynamic reconfigurable generic vliw-simd asip architecture. In: ICT. OPEN 2012 conference.

  8. Giesemann, F, Payá-Vayá, G., Blume, H, Limmer, M, Ritter, W. (2014). A comprehensive ASIC/FPGA prototyping environment for exploring embedded processing systems for advanced driver assistance applications. In: Proceedings of international conference on embedded computer systems: Architectures, Modeling, and simulation.

  9. Giesemann, F, Payá-Vayá G., Gerlach, L, Blume, H, Pflug, F, von Voigt, G. (2017). Using genetic algorithm approach to reduce register file pressure during instruction scheduling. In: Proceedings of international conference on embedded computer systems: Architectures, Modeling and simulation.

  10. Hennessy, JL, & Gross, T. (1983). Postpass code optimization of pipeline constraints. ACM Transactions on Programming Languages and Systems, 5(3), 422–448.

    Article  Google Scholar 

  11. Äijö, T., Jääskeläinen, P., Elomaa, T, Kultala, H, Takala, J. (2015). Integer linear programming-based scheduling for Transport Triggered Architectures. ACM Transactions on Architecture and Code Optimization, 12(4), 1–22. https://doi.org/10.1145/2845082.

    Article  Google Scholar 

  12. Kri, F, & Feeley, M. (2004). Genetic instruction scheduling and register allocation. In: XXIV international conference of the chilean computer science society.

  13. Lee, C, Lee, JK, Hwang, T, Tsai, SC. (2003). Compiler optimization on VLIW instruction scheduling for low power. ACM Transaction on Design Automation of Electronic Systems, 8(2), 252–268.

    Article  Google Scholar 

  14. Leupers, R, & Bashford, S. (2000). Graph-based code selection techniques for embedded processors. ACM Transactions on Design Automation of Electronic Systems, 5(4), 794–814.

    Article  Google Scholar 

  15. Lin, YC, You, YP, Lee, JK. (2006). Register allocation for VLIW DSP processors with irregular register files. In: 12Th workshop on compilers for parallel computers (CPC 2006), Coruña, Spain.

  16. Lorenz, M, Leupers, R, Marwedel, P, Drager, T, Fettweis, G. (2001). Low-energy DSP code generation using a genetic algorithm. In: Proceedings of the IEEE international conference on computer design: vlsi in computers and processors (ICCD), pp. 431–437. https://doi.org/10.1109/ICCD.2001.955062.

  17. Lorenz, M, Wehmeyer, L, Dräger, T. (2002). Energy aware compilation for DSPs with SIMD instructions. In: Proceedings of the joint conference on Languages, compilers and tools for embedded systems software and compilers for embedded systems - LCTES/SCOPES ’02, ACM Press, pp. 94–101. https://doi.org/10.1145/513829.513847.

  18. Lozano, RC, Carlsson, M, Drejhammar, F, Schulte, C. (2012). Constraint-based register allocation and instruction scheduling. In Lecture notes in computer science. https://doi.org/10.1007/978-3-642-33558-7_54 (pp. 750–766). Berlin: Springer.

  19. Lozano, RC, Carlsson, M, Blindell, GH, Schulte, C. (2014). Combinatorial spill code optimization and ultimate coalescing. In: Proceedings of the 2014 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems - LCTES ’14, ACM Press, pp. 23–32. https://doi.org/10.1145/2597809.2597815.

  20. Maldonado-Bascón, S., Lafuente-Arroyo, S, Gil-Jiménez, P., Gomez-Moreno, H, Lopez-Ferreras, F. (2007). Road-sign detection and recognition based on support vector machines. IEEE Transactions on Intelligent Transportation Systems, 8(2), 264–278. https://doi.org/10.1109/TITS.2007.895311.

    Article  Google Scholar 

  21. McGovern, A, Moss, E, Barto, AG. (2002). Building a basic block instruction scheduler with reinforcement learning and rollouts. Machine Learning, 49(2/3), 141–160. https://doi.org/10.1023/a:1017976211990.

    Article  Google Scholar 

  22. Nian, C, Yanxiang, H, Yong, C, Ximi, L, Qian, L. (2012). PSO Based instruction scheduling for low power. In: International conference on computer distributed control and intelligent environmental monitoring (CDCIEM), IEEE. https://doi.org/10.1109/cdciem.2012.129.

  23. Nicolau, A, Potasman, R, Wang, H. (1992). Register allocation, renaming and their impact on fine-grain parallelism. In: Languages and Compilers for Parallel Computing, Springer-Verlag, pp. 218–235. https://doi.org/10.1007/bfb0038667.

  24. Parikh, A, Kim, S, Kandemir, M, Vijaykrishnan, N, Irwin, M J. (2004). Instruction scheduling for low power. Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology, 37(1), 129–149. https://doi.org/10.1023/b:vlsi.0000017007.28247.f6.

    Article  Google Scholar 

  25. Payá-Vayá, G. (2011). Design and analysis of a generic VLIW processor for multimedia applications PhD thesis. Leibniz Universität Hannover: Institute of Microelectronic Systems.

    Google Scholar 

  26. Payá-Vayá, G, Martín Langerwerf, J, Pirsch, P. (2005). Rapanui: Rapid prototyping for media processor architecture exploration. In Hämäläinen, T D, Pimentel, A D, Takala, J, Vassiliadis, S (Eds.) Embedded computer systems: architectures, modeling, and simulation, lecture notes in computer science. https://doi.org/10.1007/11512622_5, (Vol. 3553 pp. 249–265). Berlin: Springer.

  27. Payá-Vayá, G., Martín-Langerwerf, J., Taptimthong, P., Pirsch, P. (2007). Design space exploration of media processors: A parameterized scheduler. In: 2007 international conference on embedded computer systems: Architectures, Modeling and simulation, pp. 41–49. https://doi.org/10.1109/ICSAMOS.2007.4285732.

  28. Payá-Vayá, G., Martín Langerwerf, J., Giesemann, F., Blume, H., Pirsch, P. (2009). Instruction merging to increase parallelism in VLIW architectures. In: International symposium on system-on-chip (SOC), pp. 143–146. https://doi.org/10.1109/SOCC.2009.5335660.

  29. Payá-Vayá, G., Martín-langerwerf, J., Pirsch, P. (2009). A multi-shared register file structure for VLIW processors. Journal of Signal Processing Systems, 58(2), 215–231. https://doi.org/10.1007/s11265-009-0355-2.

    Article  Google Scholar 

  30. Payá-Vayá, G, Martín-Langerwerf, J, Blume, H, Pirsch, P. (2010). A forwarding-sensitive instruction scheduling approach to reduce register file constraints in VLIW architectures. In: 21st IEEE international conference on application-specific systems, Architectures and processors (ASAP), IEEE. https://doi.org/10.1109/asap.2010.5541015.

  31. She, D, He, Y, Mesman, B, Corporaal, H. (2012). Scheduling for register file energy minimization in explicit datapath architectures. In: Proceedings of the IEEE design, automation & test in europe conference & exhibition (DATE), pp. 388–393 . https://doi.org/10.1109/date.2012.6176502.

  32. Su, X, Wu, H, Xue, J. (2017). An efficient WCET-aware instruction scheduling and register allocation approach for clustered VLIW processors. ACM Transactions on Embedded Computing Systems, 16(5s), 1–21. https://doi.org/10.1145/3126524.

    Article  Google Scholar 

  33. Wang, M, Wang, Y, Liu, D, Qin, Z, Shao, Z. (2010). Compiler-assisted leakage-aware loop scheduling for embedded VLIW DSP processors. Journal of Systems and Software, 83(5), 772–785. https://doi.org/10.1016/j.jss.2009.11.727.

    Article  Google Scholar 

  34. Wilken, K, Liu, J, Heffernan, M. (2000). Optimal instruction scheduling using integer programming. SIGPLAN Not, 35(5), 121–133. https://doi.org/10.1145/358438.349318.

    Article  Google Scholar 

  35. Xiao, S, & Lai, EMK. (2005). Instruction scheduling of VLIW architectures for balanced power consumption. In: Proceedings of the conference on Asia South Pacific design automation (ASP-DAC ’05), ACM Press, pp. 824–829. https://doi.org/10.1145/1120725.1121027.

  36. Xiao, S, & Lai, EMK. (2007). VLIW instruction scheduling for minimal power variation. ACM Transactions on Architecture and Code Optimization, 4(3), 18–es. https://doi.org/10.1145/1275937.1275942.

    Article  Google Scholar 

  37. Yun, HS, & Kim, J. (2001). Power-aware modulo scheduling for high-performance VLIW processors. In: ISLPED ’01: proceedings of the 2001 international symposium on low power electronics and design (IEEE Cat. No.01TH8581), ACM, pp. 40–45. https://doi.org/10.1109/lpe.2001.945369.

  38. Zeitlhofer, T, & Wess, B. (1999). Operation scheduling for parallel functional units using genetic algorithms. In: IEEE international conference on acoustics, speech, and signal processing. Proceedings. ICASSP99 (Cat. No.99CH36258), vol. 4, pp. 1997–2000. https://doi.org/10.1109/ICASSP.1999.758319.

  39. Zhong, S, Wei, J, Guo, W, Wang, Z. (2010). Instruction scheduling using genetic algorithm with taboo search for TTA-like processors. In: International conference on computer design and applications. IEEE. https://doi.org/10.1109/iccda.2010.5541389.

Download references

Acknowledgements

This work was partly funded by the German Research Council (DFG) under project number PA 2762/2-1.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Guillermo Payá-Vayá.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Giesemann, F., Gerlach, L. & Payá-Vayá, G. Evolutionary Algorithms for Instruction Scheduling, Operation Merging, and Register Allocation in VLIW Compilers. J Sign Process Syst 92, 655–678 (2020). https://doi.org/10.1007/s11265-019-01493-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11265-019-01493-2

Keywords

Navigation