Performance analysis challenges and framework for high-performance reconfigurable computing

Published: 01 May 2008

Abstract

Reconfigurable computing (RC) applications employing both microprocessors and FPGAs have potential for large speedup when compared with traditional (software) parallel applications. However, this potential is marred by the additional complexity of these dual-paradigm systems, making it difficult to identify performance bottlenecks and achieve desired performance. Performance analysis concepts and tools are well researched and widely available for traditional parallel applications but are lacking in RC, despite being of great importance due to the applications' increased complexity. In this paper, we explore challenges and present new techniques in automated instrumentation, runtime measurement, and visualization of RC application behavior. We also present ideas for integration with conventional performance analysis tools to create a unified tool for RC applications as well as our initial framework for FPGA instrumentation and measurement. Results from a case study are provided using a prototype of this new tool.

Published In

Parallel Computing  Volume 34, Issue 4-5
May, 2008
94 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 01 May 2008

Author Tags

  1. FPGA
  2. Instrumentation
  3. Measurement
  4. Performance analysis
  5. Reconfigurable computing
  6. Visualization

Qualifiers

  • Article

Cited By

  • (2023) Modular VNF Components Acceleration With FPGA Overlays. IEEE Transactions on Network and Service Management 20:1 (846-857). DOI: 10.1109/TNSM.2022.3211448. Online publication date: 1-Mar-2023
  • (2014) FPGA-based Accelerators for Parallel Data Sort. Applied Computer Systems 16:1 (53-63). DOI: 10.1515/acss-2014-0013. Online publication date: 1-Dec-2014
  • (2013) Virtualizable hardware/software design infrastructure for dynamically partially reconfigurable systems. ACM Transactions on Reconfigurable Technology and Systems 6:2 (1-18). DOI: 10.1145/2499625.2499628. Online publication date: 2-Aug-2013
  • (2012) HwPMI. International Journal of Reconfigurable Computing 2012 (2-2). DOI: 10.1155/2012/162404. Online publication date: 1-Jan-2012
  • (2012) Optimization of Shared High-Performance Reconfigurable Computing Resources. ACM Transactions on Embedded Computing Systems 11:2 (1-22). DOI: 10.1145/2220336.2220348. Online publication date: 1-Jul-2012
  • (2012) Communication visualization for bottleneck detection of high-level synthesis applications. Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays (33-36). DOI: 10.1145/2145694.2145701. Online publication date: 22-Feb-2012
  • (2011) Performance visualization for large-scale computing systems. Proceedings of the 14th international conference on Human-computer interaction: design and development approaches - Volume Part I (450-460). DOI: 10.5555/2022384.2022438. Online publication date: 9-Jul-2011
  • (2011) Cellular Automata Simulations on a FPGA cluster. International Journal of High Performance Computing Applications 25:2 (193-204). DOI: 10.1177/1094342010383138. Online publication date: 1-May-2011
  • (2011) FPGA acceleration of communication-bound streaming applications. International Journal of Reconfigurable Computing 2011 (1-11). DOI: 10.1155/2011/760954. Online publication date: 1-Jan-2011
  • (2011) Platform-aware bottleneck detection for reconfigurable computing applications. ACM Transactions on Reconfigurable Technology and Systems 4:3 (1-28). DOI: 10.1145/2000832.2000842. Online publication date: 22-Aug-2011