More Web Proxy on the site http://driver.im/

research-article

A general constraint-centric scheduling framework for spatial architectures

Authors:

Michael Sartin-Tarm,

Lorenzo De Carli,

Karthikeyan Sankaralingam,

Cristian Estan,

Behnam RobatmiliAuthors Info & Claims

ACM SIGPLAN Notices, Volume 48, Issue 6

Pages 495 - 506

https://doi.org/10.1145/2499370.2462163

Published: 16 June 2013 Publication History

Abstract

Specialized execution using spatial architectures provides energy efficient computation, but requires effective algorithms for spatially scheduling the computation. Generally, this has been solved with architecture-specific heuristics, an approach which suffers from poor compiler/architect productivity, lack of insight on optimality, and inhibits migration of techniques between architectures.

Our goal is to develop a scheduling framework usable for all spatial architectures. To this end, we expresses spatial scheduling as a constraint satisfaction problem using Integer Linear Programming (ILP). We observe that architecture primitives and scheduler responsibilities can be related through five abstractions: placement of computation, routing of data, managing event timing, managing resource utilization, and forming the optimization objectives. We encode these responsibilities as 20 general ILP constraints, which are used to create schedulers for the disparate TRIPS, DySER, and PLUG architectures. Our results show that a general declarative approach using ILP is implementable, practical, and typically matches or outperforms specialized schedulers.

References

[1]

Trips toolchain, http://www.cs.utexas.edu/ trips/dist/.

[2]

A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools.

[3]

S. Amarasinghe, D. R. Karger, W. Lee, and V. S. Mirrokni. A theoretical and practical approach to instruction scheduling on spatial architectures. Technical report, MIT, 2002.

[4]

S. Amellal and B. Kaminska. Functional synthesis of digital systems with tass. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 13(5):537--552, 1994.

Digital Library

[5]

C. Ancourt and F. Irigoin. Scanning polyhedra with do loops. In PPOPP 1991.

Digital Library

[6]

O. Azizi, A. Mahesri, B. C. Lee, S. J. Patel, and M. Horowitz. Energyperformance tradeoffs in processor architecture and circuit design: a marginal cost analysis. In ISCA 2010.

Digital Library

[7]

S. S. Battacharyya, E. A. Lee, and P. K. Murthy. Software Synthesis from Dataflow Graphs. Kluwer Academic Publishers, 1996.

Digital Library

[8]

S. Borkar and A. A. Chien. The future of microprocessors. Commun. ACM, 54(5):67--77, 2011.

Digital Library

[9]

D. Burger, S. W. Keckler, K. S. McKinley, M. Dahlin, L. K. John, C. Lin, C. R. Moore, J. Burrill, R. G. McDonald, W. Yoder, and the TRIPS Team. Scaling to the end of silicon with EDGE architectures. IEEE Computer, 37(7):44--55, 2004.

Digital Library

[10]

N. Clark, M. Kudlur, H. Park, S. Mahlke, and K. Flautner. Applicationspecific processing on a general-purpose core via transparent instruction set customization. In MICRO 2004.

Digital Library

[11]

J. Cong, K. Gururaj, G. Han, and W. Jiang. Synthesis algorithm for application-specific homogeneous processor networks. IEEE Trans. Very Large Scale Integr. Syst., 17(9), Sept. 2009.

Digital Library

[12]

K. Coons, X. Chen, S. Kushwaha, K. S. McKinley, and D. Burger. A Spatial Path Scheduling Algorithm for EDGE Architectures. In ASPLOS 2006.

Digital Library

[13]

L. De Carli, Y. Pan, A. Kumar, C. Estan, and K. Sankaralingam. Plug: Flexible lookup modules for rapid deployment of new protocols in high-speed routers. In SIGCOMM 2009.

Digital Library

[14]

L. de Moura and N. Bjørner. Z3: An efficient SMT solver. In TACAS, 2008.

Digital Library

[15]

A. Deb, J. M. Codina, and A. Gonzales. Softhv: A hw/sw co-designed processor with horizontal and vertical fusion. In International Conference on Computing Frontiers 2011.

Digital Library

[16]

A. E. Eichenberger and E. S. Davidson. Efficient formulation for optimal modulo schedulers. In PLDI 1997.

Digital Library

[17]

J. R. Ellis. Bulldog: a compiler for vliw architectures. PhD thesis, 1985.

Digital Library

[18]

D. W. Engels, J. Feldman, D. R. Karger, and M. Ruhl. Parallel processor scheduling with delay constraints. In SODA 2001.

Digital Library

[19]

H. Esmaeilzadeh, E. Blem, R. S. Amant, K. Sankaralingam, and D. Burger. Dark Silicon and the End of Multicore Scaling. In ISCA 2011.

Digital Library

[20]

H. Esmaeilzadeh, A. Sampson, L. Ceze, and D. Burger. Neural acceleration for general-purpose approximate programs. In MICRO 2012.

Digital Library

[21]

K. Fan, H. h. Park, M. Kudlur, and S. o. Mahlke. Modulo scheduling for highly customized datapaths to increase hardware reusability. In CGO 2008.

Digital Library

[22]

P. Feautrier. Some efficient solutions to the affine scheduling problem. International Journal of Parallel Programming, 21:313--347, 1992.

Digital Library

[23]

M. Gebhart, B. A. Maher, K. E. Coons, J. Diamond, P. Gratz, M. Marino, N. Ranganathan, B. Robatmili, A. Smith, J. Burrill, S. W. Keckler, D. Burger, and K. S. McKinley. An evaluation of the trips computer system. In ASPLOS 2009.

Digital Library

[24]

G. J. Gordon, S. A. Hong, and M. Dudík. First-order mixed integer linear programming. In UAI 2009.

Digital Library

[25]

V. Govindaraju, C.-H. Ho, T. Nowatzki, J. Chhugani, N. Satish, K. Sankaralingam, and C. Kim. Dyser: Unifying functionality and parallelism specialization for energy efficient computing. IEEE Micro, 33(5), 2012.

Digital Library

[26]

V. Govindaraju, C.-H. Ho, and K. Sankaralingam. Dynamically specialized datapaths for energy efficient computing. In HPCA 2011.

Digital Library

[27]

S. Gupta, S. Feng, A. Ansari, S. Mahlke, and D. August. Bundled execution of recurring traces for energy-efficient general purpose processing. In MICRO 2011.

Digital Library

[28]

N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki. Toward dark silicon in servers. IEEE Micro, 31(4):6--15, 2011.

Digital Library

[29]

J. N. Hooker. Logic, optimization and constraint programming. INFORMS Journal on Computing, 14:295--321, 2002.

Digital Library

[30]

J. N. Hooker and M. A. Osorio. Mixed logical-linear programming. Discrete Appl. Math., 96-97(1), Oct. 1999.

Digital Library

[31]

Z. Huang, S. Malik, N. Moreano, and G. Araujo. The design of dynamically reconfigurable datapath coprocessors. ACM Trans. Embed. Comput. Syst., 3(2):361--384, May 2004.

Digital Library

[32]

R. Joshi, G. Nelson, and K. Randall. Denali: a goal-directed superoptimizer. In PLDI 2002.

Digital Library

[33]

K. Kailas and A. Agrawala. Cars: A new code generation framework for clustered ilp processors. In HPCA 2001.

Digital Library

[34]

M. Kudlur and S. Mahlke. Orchestrating the execution of stream programs on multicore platforms. In PLDI 2008.

Digital Library

[35]

A. Kumar, L. De Carli, S. J. Kim, M. de Kruijf, K. Sankaralingam, C. Estan, and S. Jha. Design and implementation of the plug architecture for programmable and efficient network lookups. In PACT 2010.

Digital Library

[36]

W. Lee, R. Barua, M. Frank, D. Srikrishna, J. Babb, V. Sarkar, and S. Amarasinghe. Space-time scheduling of instruction-level parallelism on a raw machine. In ASPLOS 1998.

Digital Library

[37]

M. Mercaldi, S. Swanson, A. Petersen, A. Putnam, A. Schwerin, M. Oskin, and S. J. Eggers. Instruction scheduling for a tiled dataflow architecture. In ASPLOS 2006.

Digital Library

[38]

M. Mercaldi, S. Swanson, A. Petersen, A. Putnam, A. Schwerin, M. Oskin, and S. J. Eggers. Modeling instruction placement on a spatial architecture. In SPAA 2006.

Digital Library

[39]

M. Mishra, T. J. Callahan, T. Chelcea, G. Venkataramani, M. Budiu, and S. C. Goldstein. Tartan: Evaluating spatial computation for whole program execution. In ASPLOS 2006.

Digital Library

[40]

R. Nagarajan, S. K. Kushwaha, D. Burger, K. S. McKinley, C. Lin, and S. W. Keckler. Static placement, dynamic issue (spdi) scheduling for edge architectures. In PACT 2004.

Digital Library

[41]

E. Özer, S. Banerjia, and T. M. Conte. Unified assign and schedule: a new approach to scheduling for clustered register file microarchitectures. In MICRO 31.

[42]

J. Palsberg and M. Naik. Ilp-based resource-aware compilation, 2004.

[43]

H. Park, K. Fan, S. A. Mahlke, T. Oh, H. Kim, and H.-s. Kim. Edgecentric modulo scheduling for coarse-grained reconfigurable architectures. In PACT 2008.

Digital Library

[44]

W. Pugh. The omega test: a fast and practical integer programming algorithm for dependence analysis. In Supercomputing 1991.

Digital Library

[45]

N. Satish, K. Ravindran, and K. Keutzer. A decomposition-based constraint optimization approach for statically scheduling task graphs with communication delays to multiprocessors. In DATE 2007.

Digital Library

[46]

S. Swanson, K. Michelson, A. Schwerin, and M. Oskin. Wavescalar. In MICRO 2003.

Digital Library

[47]

M. Thuresson, M. Sjalander, M. Bjork, L. Svensson, P. Larsson- Edefors, and P. Stenstrom. Flexcore: Utilizing exposed datapath control for efficient computing. In IC-SAMOS 2007.

[48]

G. Venkatesh, J. Sampson, N. Goulding, S. Garcia, V. Bryksin, J. Lugo-Martinez, S. Swanson, and M. B. Taylor. Conservation cores: reducing the energy of mature computations. In ASPLOS 2010.

Digital Library

[49]

H. M. Wagner. An integer linear-programming model for machine scheduling. Naval Research Logistics Quarterly, 6(2):131--140, 1959.

[50]

E. Waingold, M. Taylor, D. Srikrishna, V. Sarkar, W. Lee, V. Lee, J. Kim, M. Frank, P. Finch, R. Barua, J. Babb, S. Amarasinghe, and A. Agarwal. Baring It All to Software: RAW Machines. Computer, 30(9):86--93, 1997.

Digital Library

[51]

M. Watkins, M. Cianchetti, and D. Albonesi. Shared reconfigurable architectures for cmps. In FPGA 2008.

[52]

L. A. Wolsey and G. L. Nemhauser. Integer and Combinatorial Optimization.

Digital Library

Cited By

Melchert JMei YKoul KLiu QHorowitz MRaina P(2024)Cascade: An Application Pipelining Toolkit for Coarse-Grained Reconfigurable ArraysIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.339054243:10(3055-3067)Online publication date: 1-Oct-2024
https://dl.acm.org/doi/10.1109/TCAD.2024.3390542
Lu LLuo ZZheng SYin JCong JLiang YYin J(2024)Rubick: A Unified Infrastructure for Analyzing, Exploring, and Implementing Spatial Architectures via Dataflow DecompositionIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2023.333720843:4(1177-1190)Online publication date: 1-Apr-2024
https://dl.acm.org/doi/10.1109/TCAD.2023.3337208
Ujjainkar NLeng JZhu YSolihin YHeinrich M(2023)ImaGen: A General Framework for Generating Memory- and Power-Efficient Image Processing AcceleratorsProceedings of the 50th Annual International Symposium on Computer Architecture10.1145/3579371.3589076(1-13)Online publication date: 17-Jun-2023
https://dl.acm.org/doi/10.1145/3579371.3589076
Show More Cited By

Index Terms

A general constraint-centric scheduling framework for spatial architectures

Recommendations

A general constraint-centric scheduling framework for spatial architectures
PLDI '13: Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation

Specialized execution using spatial architectures provides energy efficient computation, but requires effective algorithms for spatially scheduling the computation. Generally, this has been solved with architecture-specific heuristics, an approach which ...
A Scheduling Framework for Spatial Architectures Across Multiple Constraint-Solving Theories

Spatial architectures provide energy-efficient computation but require effective scheduling algorithms. Existing heuristic-based approaches offer low compiler/architect productivity, little optimality insight, and low architectural portability.

We seek ...
A Heuristic Ceiling Point Algorithm for General Integer Linear Programming

This paper first examines the role of ceiling points in solving a pure, general integer linear programming problem P. Several kinds of ceiling points are defined and analyzed and one kind called "feasible 1-ceiling points" proves to be of special ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices

ACM SIGPLAN Notices Volume 48, Issue 6

PLDI '13

June 2013

515 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/2499370

Issue’s Table of Contents

PLDI '13: Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2013
546 pages
ISBN:9781450320146
DOI:10.1145/2491956
General Chair:
Hans-J. Boehm
HP Labs
,
Program Chair:
Cormac Flanagan
University of California, Santa Cruz

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 June 2013

Published in SIGPLAN Volume 48, Issue 6

Check for updates

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

69
Total Citations
View Citations
911
Total Downloads

Downloads (Last 12 months)91
Downloads (Last 6 weeks)10

Reflects downloads up to 01 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Melchert JMei YKoul KLiu QHorowitz MRaina P(2024)Cascade: An Application Pipelining Toolkit for Coarse-Grained Reconfigurable ArraysIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.339054243:10(3055-3067)Online publication date: 1-Oct-2024
https://dl.acm.org/doi/10.1109/TCAD.2024.3390542
Lu LLuo ZZheng SYin JCong JLiang YYin J(2024)Rubick: A Unified Infrastructure for Analyzing, Exploring, and Implementing Spatial Architectures via Dataflow DecompositionIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2023.333720843:4(1177-1190)Online publication date: 1-Apr-2024
https://dl.acm.org/doi/10.1109/TCAD.2023.3337208
Ujjainkar NLeng JZhu YSolihin YHeinrich M(2023)ImaGen: A General Framework for Generating Memory- and Power-Efficient Image Processing AcceleratorsProceedings of the 50th Annual International Symposium on Computer Architecture10.1145/3579371.3589076(1-13)Online publication date: 17-Jun-2023
https://dl.acm.org/doi/10.1145/3579371.3589076
Liu TLi WFan Z(2023)DFGC: DFG-aware NoC Control based on Time Stamp Prediction for Dataflow Architecture2023 IEEE 41st International Conference on Computer Design (ICCD)10.1109/ICCD58817.2023.00071(432-439)Online publication date: 6-Nov-2023
https://doi.org/10.1109/ICCD58817.2023.00071
B. Khan MR. Khan AAlkahtani H(2022)Exploring the Approaches to Data Flow ComputingComputers, Materials & Continua10.32604/cmc.2022.02062371:2(2333-2346)Online publication date: 2022
https://doi.org/10.32604/cmc.2022.020623
Man XZhu JSong GYin SWei SLiu LSalapura VZahran MChong FTang L(2022)CaSMapProceedings of the 49th Annual International Symposium on Computer Architecture10.1145/3470496.3527426(259-273)Online publication date: 18-Jun-2022
https://dl.acm.org/doi/10.1145/3470496.3527426
Martin K(2022)Twenty Years of Automated Methods for Mapping Applications on CGRA2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW55747.2022.00118(679-686)Online publication date: May-2022
https://doi.org/10.1109/IPDPSW55747.2022.00118
Feng YLi DTan XYe XFan DLi WWang DZhang HTang Z(2022)Accelerating Data Transfer in Dataflow Architectures Through a Look-Ahead Acknowledgment MechanismJournal of Computer Science and Technology10.1007/s11390-020-0555-637:4(942-959)Online publication date: 1-Jul-2022
https://dl.acm.org/doi/10.1007/s11390-020-0555-6
Reshadi MMirmahaleh S(2022)Mapping and virtual neuron assignment algorithms for MAERI acceleratorThe Journal of Supercomputing10.1007/s11227-021-03893-378:1(238-257)Online publication date: 1-Jan-2022
https://dl.acm.org/doi/10.1007/s11227-021-03893-3
Wei SLiu LZhu JDeng CWei SLiu LZhu JDeng C(2022)Compilation SystemSoftware Defined Chips10.1007/978-981-19-6994-2_4(197-311)Online publication date: 21-Oct-2022
https://doi.org/10.1007/978-981-19-6994-2_4
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Issue’s Table of Contents