DOI: 10.1145/2038698.2038707
research-article

An efficient heuristic for instruction scheduling on clustered VLIW processors

Published: 09 October 2011

Abstract

Clustering is a well-known technique for improving the scalability of classical VLIW processors. A clustered VLIW processor consists of multiple clusters, each of which has its own register file and functional units. This paper presents a novel phase-coupled, priority-based heuristic for scheduling the instructions of a basic block on a clustered VLIW processor. Our heuristic converts the instruction scheduling problem into the problem of scheduling a set of instructions with a common deadline. The priority of each instruction v_i is its l_max(v_i)-successor-tree-consistent deadline, which is the upper bound on the latest completion time of v_i in any feasible schedule for a relaxed problem in which the precedence-latency constraints between v_i and all its successors, as well as the resource constraints, are considered. We have simulated our heuristic, the UAS heuristic, and the Integrated heuristic on the 808 basic blocks taken from the MediaBench II benchmark suite using six processor models. On average, our heuristic improves on the UAS heuristic by 25%, 25%, 33%, 23%, 26%, and 27% for the six processor models, respectively, and on the Integrated heuristic by 15%, 16%, 15%, 9%, 20%, and 8%, respectively.
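The general idea of deadline-driven scheduling on a clustered machine can be illustrated with a minimal sketch. This is not the paper's heuristic: the deadline below is a plain backward pass over precedence-latency constraints that ignores resource constraints, so it is weaker than the l_max(v_i)-successor-tree-consistent deadline the abstract describes. The function names, the resource model (a fixed number of interchangeable issue slots per cluster per cycle), and the `comm_latency` parameter for inter-cluster copies are all assumptions made for illustration.

```python
from collections import defaultdict


def deadlines(latency, succs, common_deadline):
    """Latest completion time of each instruction under a common deadline.

    Simplification (assumption): only precedence-latency constraints are
    used; resource constraints are ignored.
    """
    d = {}

    def visit(v):
        if v not in d:
            # v must complete no later than the latest start of each successor.
            d[v] = min([common_deadline] +
                       [visit(s) - latency[s] for s in succs.get(v, [])])
        return d[v]

    for v in latency:
        visit(v)
    return d


def schedule(latency, succs, n_clusters, units_per_cluster, comm_latency):
    """Greedy schedule: smallest deadline first, earliest-start cluster."""
    preds = defaultdict(list)
    for u, vs in succs.items():
        for v in vs:
            preds[v].append(u)

    # The common deadline only shifts all values; total latency is a safe pick.
    d = deadlines(latency, succs, sum(latency.values()))
    used = defaultdict(int)  # (cluster, cycle) -> issue slots already taken
    start, cluster = {}, {}

    # With latencies >= 1, ascending deadline order is a topological order.
    for v in sorted(latency, key=lambda x: (d[x], x)):
        best = None
        for c in range(n_clusters):
            # Operands produced on another cluster arrive comm_latency later.
            t = max([0] + [start[p] + latency[p] +
                           (comm_latency if cluster[p] != c else 0)
                           for p in preds[v]])
            while used[(c, t)] >= units_per_cluster:
                t += 1
            if best is None or t < best[0]:
                best = (t, c)
        start[v], cluster[v] = best
        used[(cluster[v], start[v])] += 1
    return start, cluster


if __name__ == "__main__":
    # Hypothetical four-instruction basic block: a and b feed c, c feeds d.
    lat = {"a": 1, "b": 2, "c": 1, "d": 1}
    dag = {"a": ["c"], "b": ["c"], "c": ["d"]}
    print(schedule(lat, dag, n_clusters=2, units_per_cluster=1, comm_latency=1))
```

Cluster assignment and cycle assignment happen in the same loop, which is what "phase-coupled" refers to. The sketch breaks ties by instruction name and by lowest cluster index, a choice made only to keep the output deterministic.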

Published In

CASES '11: Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
October 2011
250 pages
ISBN: 9781450307130
DOI: 10.1145/2038698
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustered VLIW processor
  2. instruction scheduling
  3. inter-cluster communication latency
  4. inter-instructional latency

Qualifiers

  • Research-article

Conference

ESWeek '11: Seventh Embedded Systems Week
October 9 - 14, 2011
Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 52 of 230 submissions, 23%

Article Metrics

  • Downloads (last 12 months): 1
  • Downloads (last 6 weeks): 1
Reflects downloads up to 13 Dec 2024

Cited By

  • (2024) Optimizing VLIW Instruction Scheduling via a Two-Dimensional Constrained Dynamic Programming. ACM Transactions on Design Automation of Electronic Systems 29:5 (1-20). DOI: 10.1145/3643135. Online publication date: 25-Jan-2024
  • (2024) A Graph Neural Network Approach to Improve List Scheduling Heuristics Under Register-Pressure. 2024 13th International Conference on Modern Circuits and Systems Technologies (MOCAST) (01-06). DOI: 10.1109/MOCAST61810.2024.10615463. Online publication date: 26-Jun-2024
  • (2017) An Efficient WCET-Aware Instruction Scheduling and Register Allocation Approach for Clustered VLIW Processors. ACM Transactions on Embedded Computing Systems 16:5s (1-21). DOI: 10.1145/3126524. Online publication date: 27-Sep-2017
  • (2017) On Improving Performance and Energy Efficiency for Register-File Connected Clustered VLIW Architectures for Embedded System Usage. The Computer Journal. DOI: 10.1093/comjnl/bxx001. Online publication date: 22-Jan-2017
  • (2014) Lifetime holes aware register allocation for clustered VLIW processors. Proceedings of the conference on Design, Automation & Test in Europe (1-4). DOI: 10.5555/2616606.2616716. Online publication date: 24-Mar-2014
  • (2013) CAeSaR. Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems (1-10). DOI: 10.5555/2555729.2555738. Online publication date: 29-Sep-2013
  • (2013) Optimizing Instruction Scheduling and Register Allocation for Register-File-Connected Clustered VLIW Architectures. The Scientific World Journal 2013:1. DOI: 10.1155/2013/913038. Online publication date: 18-Jul-2013
  • (2013) LUCAS. ACM SIGPLAN Notices 48:5 (45-54). DOI: 10.1145/2499369.2465565. Online publication date: 20-Jun-2013
  • (2013) LUCAS. Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems (45-54). DOI: 10.1145/2491899.2465565. Online publication date: 20-Jun-2013
  • (2013) LUCAS. Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems (45-54). DOI: 10.1145/2465554.2465565. Online publication date: 20-Jun-2013