research-article

Modeling and Analyzing Dataflow Applications on NoC-Based Many-Core Architectures

Authors:

Jean-Vivien Millo,

Emilien Kofman,

Robert De Simone

ACM Transactions on Embedded Computing Systems (TECS), Volume 14, Issue 3

Article No.: 46, Pages 1 - 25

https://doi.org/10.1145/2700081

Published: 21 April 2015

Abstract

The advent of chip-level parallel architectures has prompted a renewal of interest in dataflow process networks. The trend is to model an application independently of the architecture and then morph the model to best fit the target architecture. One downplayed aspect is the mapping of communications onto the on-chip topology, even though the cost of such communications often outweighs that of computations.

This article establishes a dataflow process network called the K-periodically Routed Graph (KRG), which represents the various routing decisions made during the transformation of a genuine application into an architecture-aware version of that application.
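The abstract does not define how a KRG routes tokens, so the following is only an illustrative sketch under a stated assumption: that a routing node sends each incoming token to one of two outputs according to a binary pattern repeated periodically, and that a matching merge node reassembles the stream with the same pattern. The function names `route_k_periodic` and `merge_k_periodic` and the list-based pattern encoding are hypothetical and are not taken from the article.

```python
from itertools import cycle


def route_k_periodic(tokens, pattern):
    """Split a token stream over two outputs following a repeating 0/1 pattern.

    Assumption (not from the article): bit 0 sends the token to output 0,
    bit 1 sends it to output 1, and the pattern repeats indefinitely.
    """
    if not pattern or any(bit not in (0, 1) for bit in pattern):
        raise ValueError("pattern must be a non-empty sequence of 0s and 1s")
    outputs = ([], [])
    for token, bit in zip(tokens, cycle(pattern)):
        outputs[bit].append(token)
    return outputs


def merge_k_periodic(left, right, pattern):
    """Interleave two streams following the same repeating 0/1 pattern.

    Stops as soon as the stream selected by the pattern has no token left.
    """
    left_it, right_it = iter(left), iter(right)
    merged = []
    for bit in cycle(pattern):
        source = right_it if bit else left_it
        try:
            merged.append(next(source))
        except StopIteration:
            break
    return merged


if __name__ == "__main__":
    low, high = route_k_periodic(range(6), [0, 1, 1])
    print(low, high)
    print(merge_k_periodic(low, high, [0, 1, 1]))
```

In this sketch, routing and then merging with the same pattern restores the original token order, which is the kind of property that makes a statically known routing pattern attractive for analysis.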

Cited By

Kofman E, de Simone R, Talpin J (2016). A formal approach to the mapping of tasks on an heterogenous multicore, energy-aware architecture. Proceedings of the 14th ACM-IEEE International Conference on Formal Methods and Models for System Design, 153-162. Online publication date: 18-Nov-2016. https://dl.acm.org/doi/10.5555/3343414.3343436

Martin K, Rizk M, Sepulveda M, Diguet J (2016). Notifying memories. Proceedings of the 53rd Annual Design Automation Conference, 1-6. Online publication date: 5-Jun-2016. https://dl.acm.org/doi/10.1145/2897937.2898051

Kofman E, de Simone R (2016). A formal approach to the mapping of tasks on an heterogenous multicore, energy-aware architecture. 2016 ACM/IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE), 153-162. Online publication date: Nov-2016. https://doi.org/10.1109/MEMCOD.2016.7797760

Index Terms

Modeling and Analyzing Dataflow Applications on NoC-Based Many-Core Architectures
1. Theory of computation
  1. Logic
  2. Models of computation
    1. Abstract machines

Published In

ACM Transactions on Embedded Computing Systems Volume 14, Issue 3

Special Issue on Embedded Platforms for Crypto and Regular Papers

May 2015

515 pages

ISSN:1539-9087

EISSN:1558-3465

DOI:10.1145/2764962

Editor:
Sandeep K. Shukla
Virginia Tech, USA

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 21 April 2015

Accepted: 01 November 2014

Revised: 01 November 2014

Received: 01 April 2014

Published in TECS Volume 14, Issue 3

Qualifiers

Research-article
Research
Refereed
