Reconfiguration and Communication-Aware Task Scheduling for High-Performance Reconfigurable Computing

Published: 01 November 2010

Abstract

High-performance reconfigurable computing accelerates significant portions of an application using reconfigurable hardware. When the hardware tasks of an application cannot fit in an FPGA simultaneously, the task graph must be partitioned and scheduled into multiple FPGA configurations in a way that minimizes the total execution time. This article proposes the Reduced Data Movement Scheduling (RDMS) algorithm, which aims to improve the overall performance of hardware tasks by taking into account reconfiguration time, data dependencies between tasks, intertask communication, and task resource utilization. The proposed algorithm is based on dynamic programming. A mathematical analysis shows that, in the worst case, the resulting execution time exceeds that of the optimal solution by a factor of at most about 1.6. Simulations on randomly generated task graphs indicate that the RDMS algorithm can reduce interconfiguration communication time by 11% and 44%, respectively, compared with two other approaches that consider data dependency and hardware resource utilization only. The practicality and efficiency of the proposed algorithm relative to other approaches are demonstrated by simulating a task graph from a real-life application, N-body simulation, with bandwidth constraints and FPGA parameters taken from existing high-performance reconfigurable computers. Experiments on the SRC-6 are carried out to validate the approach.
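
To make the partitioning problem concrete, the following minimal Python sketch packs hardware tasks into successive FPGA configurations with a subset-sum dynamic program. It is not the RDMS algorithm: it considers resource utilization only and ignores data dependencies, reconfiguration time, and intertask communication. The function names, the task names, the integer area units, and the capacity value are all hypothetical and chosen for illustration.

```python
from typing import Dict, List


def best_subset(areas: Dict[str, int], capacity: int) -> List[str]:
    """Subset-sum DP: choose tasks whose total area is as close to
    `capacity` as possible without exceeding it."""
    # reachable[s] holds one list of task names whose areas sum to exactly s.
    reachable: Dict[int, List[str]] = {0: []}
    for name, area in areas.items():
        # Iterate over a snapshot so that each task is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + area
            if new_total <= capacity and new_total not in reachable:
                reachable[new_total] = chosen + [name]
    return reachable[max(reachable)]


def pack_configurations(areas: Dict[str, int], capacity: int) -> List[List[str]]:
    """Greedily fill one configuration at a time as fully as possible.
    Assumption: areas are positive integers in arbitrary resource units."""
    for name, area in areas.items():
        if not 0 < area <= capacity:
            raise ValueError(f"task {name!r} has an unusable area: {area}")
    remaining = dict(areas)
    configurations: List[List[str]] = []
    while remaining:
        chosen = best_subset(remaining, capacity)
        configurations.append(chosen)
        for name in chosen:
            del remaining[name]
    return configurations


if __name__ == "__main__":
    # Hypothetical task areas and FPGA capacity.
    tasks = {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2}
    for index, config in enumerate(pack_configurations(tasks, capacity=10)):
        print(f"configuration {index}: {config}")
```

Filling each configuration as fully as possible reduces the number of reconfigurations, but a scheduler of the kind the abstract describes would also have to respect the order imposed by data dependencies and weigh the data moved between configurations.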




        Published In

        ACM Transactions on Reconfigurable Technology and Systems  Volume 3, Issue 4
        November 2010
        240 pages
        ISSN:1936-7406
        EISSN:1936-7414
        DOI:10.1145/1862648

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 01 November 2010
        Accepted: 01 August 2009
        Revised: 01 July 2009
        Received: 01 March 2009


        Author Tags

        1. Hardware task scheduling
        2. reconfigurable computing

        Qualifiers

        • Research-article
        • Research
        • Refereed

        Cited By

        • (2024) Trends, Approaches, and Gaps in Scientific Workflow Scheduling: A Systematic Review. IEEE Access 12 (182203-182231). DOI: 10.1109/ACCESS.2024.3509218. Online publication date: 2024
        • (2021) A Survey: FPGA‐Based Dynamic Scheduling of Hardware Tasks. Chinese Journal of Electronics 30:6 (991-1007). DOI: 10.1049/cje.2021.07.021. Online publication date: Nov-2021
        • (2019) Using the loop chain abstraction to schedule across loops in existing code. International Journal of High Performance Computing and Networking 13:1 (86-104). DOI: 10.5555/3302714.3302720. Online publication date: 1-Jan-2019
        • (2017) DTP. Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (1-11). DOI: 10.1145/3120895.3120901. Online publication date: 7-Jun-2017
        • (2017) A Clustering Algorithm for Communication-Aware Scheduling of Task Graphs on Multi-Core Reconfigurable Systems. IEEE Transactions on Parallel and Distributed Systems 28:10 (2718-2732). DOI: 10.1109/TPDS.2017.2703123. Online publication date: 7-Sep-2017
        • (2017) A Floorplanning Algorithm for Partially Reconfigurable FPGA in Wireless Sensor Network. Security, Privacy, and Anonymity in Computation, Communication, and Storage (667-679). DOI: 10.1007/978-3-319-72395-2_60. Online publication date: 9-Dec-2017
        • (2016) Identifying and scheduling loop chains using directives. Proceedings of the Third International Workshop on Accelerator Programming Using Directives (57-67). DOI: 10.5555/3019120.3019126. Online publication date: 13-Nov-2016
        • (2016) Identifying and Scheduling Loop Chains Using Directives. 2016 Third Workshop on Accelerator Programming Using Directives (WACCPD) (57-67). DOI: 10.1109/WACCPD.2016.010. Online publication date: Nov-2016
        • (2015) Performance-Oriented Partitioning for Task Scheduling of Parallel Reconfigurable Architectures. IEEE Transactions on Parallel and Distributed Systems 26:3 (858-867). DOI: 10.1109/TPDS.2014.2312924. Online publication date: Mar-2015
        • (2013) Metrics for Early-Stage Modeling of Many-Accelerator Architectures. IEEE Computer Architecture Letters 12:1 (25-28). DOI: 10.1109/L-CA.2012.9. Online publication date: 1-Jan-2013
