research-article

PathTracer: Understanding Response Time of Signal Processing Applications on Heterogeneous MPSoCs

Published: 01 April 2022

Abstract

In embedded and cyber-physical systems, implementing a desired functionality under design constraints increasingly requires executing a set of tasks in parallel on a heterogeneous architecture. The nature of such parallel systems makes it difficult to understand and predict performance in terms of response time, because response time depends on many factors related to both the functionality and the target architecture. State-of-the-art strategies derive response time by examining the operations each task requires for both processing and accessing shared resources. This procedure is often followed by the addition or elimination of potential interference due to task concurrency. However, such approaches require advanced knowledge of software and hardware details, which is rarely available in practice.
This work presents an alternative “top-down” strategy, called PathTracer, aimed at understanding software response time and extending the cases in which it can be analyzed and estimated. PathTracer builds on a dataflow-based application representation and on response time estimation of signal processing applications mapped onto heterogeneous Multiprocessor Systems-on-a-Chip (MPSoCs). Experimental results demonstrate that PathTracer provides (i) information on the nature of the application (work-dominated, span-dominated, or balanced parallel), and (ii) response time modeling that can reach high accuracy when performed post-execution, with prediction errors whose average and standard deviation are below 5% and 3%, respectively.
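The distinction between work-dominated, span-dominated, and balanced parallel applications can be illustrated with the classical work/span view of a task graph. The sketch below is not PathTracer's algorithm, which the abstract does not detail. It assumes a task graph with known per-task durations and, as a simplification, identical cores rather than a heterogeneous MPSoC. The function names, the `tolerance` threshold, and the example graph are hypothetical. The bounds are the standard ones for greedy list scheduling on identical processors (Graham's bound).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def work_and_span(durations, preds):
    """Return (work, span) of a task DAG.

    durations: dict mapping task name -> execution time
    preds:     dict mapping task name -> iterable of predecessor tasks
    """
    work = sum(durations.values())  # total processing time over all tasks
    graph = {task: set(preds.get(task, ())) for task in durations}
    finish = {}
    # Visit tasks so that every predecessor is handled before its successors.
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + durations[task]
    span = max(finish.values(), default=0.0)  # length of the longest path
    return work, span


def classify(work, span, cores, tolerance=0.25):
    """Label the application; the tolerance threshold is an assumption."""
    per_core = work / cores
    if per_core > span * (1 + tolerance):
        return "work-dominated"
    if span > per_core * (1 + tolerance):
        return "span-dominated"
    return "balanced"


def response_time_bounds(work, span, cores):
    """Lower and upper bounds for greedy scheduling on identical cores."""
    lower = max(work / cores, span)
    upper = work / cores + span * (1 - 1 / cores)
    return lower, upper


# Hypothetical four-task graph: src feeds a and b, which both feed sink.
durations = {"src": 1, "a": 4, "b": 2, "sink": 1}
preds = {"a": ["src"], "b": ["src"], "sink": ["a", "b"]}

work, span = work_and_span(durations, preds)
print(work, span, classify(work, span, cores=2))
print(response_time_bounds(work, span, cores=2))
```

When the work term `work / cores` is much larger than the span, adding cores is likely to help. When the span dominates, the longest dependency path limits response time regardless of the core count.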

      Information

      Published In

      ACM Transactions on Modeling and Performance Evaluation of Computing Systems  Volume 6, Issue 4
      December 2021
      106 pages
      ISSN:2376-3639
      EISSN:2376-3647
      DOI:10.1145/3505219

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 01 April 2022
      Online AM: 14 February 2022
      Accepted: 01 January 2022
      Revised: 01 January 2022
      Received: 01 July 2021
      Published in TOMPECS Volume 6, Issue 4

      Author Tags

      1. Model-based design
      2. dataflow
      3. MPSoC
      4. design space exploration
      5. processing latency
      6. signal processing applications

      Qualifiers

      • Research-article
      • Refereed

      Funding Sources

      • European Union’s
