DOI: 10.1145/2024724.2024860
research-article

An algorithm-architecture co-design framework for gridding reconstruction using FPGAs

Published: 05 June 2011

Abstract

Gridding interpolates irregularly sampled data onto a uniform grid and is a critical image reconstruction step in several applications that operate on non-Cartesian sampled data. In this paper, we present an algorithm-architecture co-design framework for accelerating gridding using FPGAs. We present a parameterized hardware library for gridding that supports both arbitrary and regular trajectories. We further describe our kernel automation framework, which supports several kernel functions through look-up-table (LUT) based Taylor polynomial evaluation. The framework is integrated using an in-house multi-FPGA development platform that provides hardware infrastructure for integrating custom accelerators. An automation flow that generates a system from an algorithm specification enables design-space exploration. We further provide several case studies by realizing systems for the nonuniform fast Fourier transform (NuFFT) with different parameter sets and porting them onto the BEE3 platform. Results show speedups of more than 16x and 2x over existing CPU and FPGA implementations, respectively, and up to 5.5 times higher performance-per-watt than a comparable GPU implementation.
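
As a rough software illustration of the two ideas named in the abstract, the sketch below spreads nonuniform 1D samples onto a uniform grid and evaluates the interpolation kernel through a look-up table with first-order Taylor correction. It is not the paper's hardware design: the Gaussian kernel, the segment count, the first-order expansion, and all function names (`build_taylor_lut`, `eval_taylor_lut`, `grid_1d`) are assumptions chosen for illustration.

```python
import math

def build_taylor_lut(f, df, lo, hi, segments):
    """Tabulate f and its derivative at the midpoint of each segment of [lo, hi]."""
    width = (hi - lo) / segments
    table = []
    for i in range(segments):
        c = lo + (i + 0.5) * width          # expansion point for this segment
        table.append((c, f(c), df(c)))
    return table, width

def eval_taylor_lut(table, width, lo, x):
    """First-order Taylor evaluation: f(x) ~ f(c) + f'(c) * (x - c)."""
    i = min(int((x - lo) / width), len(table) - 1)   # clamp x == hi into the last segment
    c, fc, dfc = table[i]
    return fc + dfc * (x - c)

def grid_1d(coords, values, grid_size, half_width, kernel):
    """Accumulate each nonuniform sample onto the grid points within half_width of it."""
    grid = [0.0] * grid_size
    for x, v in zip(coords, values):
        first = max(0, math.ceil(x - half_width))
        last = min(grid_size - 1, math.floor(x + half_width))
        for k in range(first, last + 1):
            grid[k] += v * kernel(abs(k - x))
    return grid

# Assumed kernel: a Gaussian truncated at HALF_WIDTH grid units.
SIGMA = 0.8
HALF_WIDTH = 2.0

def gaussian(d):
    return math.exp(-d * d / (2.0 * SIGMA ** 2))

def gaussian_deriv(d):
    return -d / SIGMA ** 2 * gaussian(d)

TABLE, WIDTH = build_taylor_lut(gaussian, gaussian_deriv, 0.0, HALF_WIDTH, 64)

def lut_kernel(d):
    return eval_taylor_lut(TABLE, WIDTH, 0.0, d)

# Hypothetical sample positions (in grid units) and sample values.
coords = [0.3, 2.75, 5.1, 6.9]
values = [1.0, -0.5, 2.0, 0.25]

exact = grid_1d(coords, values, 8, HALF_WIDTH, gaussian)
approx = grid_1d(coords, values, 8, HALF_WIDTH, lut_kernel)
max_err = max(abs(a - b) for a, b in zip(exact, approx))
print(f"max abs difference between exact and LUT kernels: {max_err:.2e}")
```

In a sketch like this, the table size and the polynomial order trade memory against accuracy: more segments or a higher-order expansion reduce the approximation error at the cost of a larger table or more multipliers, which is the kind of trade-off a parameterized kernel generator would expose.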

Published In

DAC '11: Proceedings of the 48th Design Automation Conference
June 2011
1055 pages
ISBN:9781450306362
DOI:10.1145/2024724
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. BEE3
  2. Cartesian
  3. Taylor polynomial evaluation
  4. gridding
  5. nonuniform fast fourier transform
  6. polar

Qualifiers

  • Research-article

Conference

DAC '11
Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Cited By

  • (2021) Jigsaw: A Slice-and-Dice Approach to Non-uniform FFT Acceleration for MRI Image Reconstruction. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 714-723. DOI: 10.1109/IPDPS49936.2021.00081. Online publication date: May 2021.
  • (2017) Memory-Optimized Re-Gridding Architecture for Non-Uniform Fast Fourier Transform. IEEE Transactions on Circuits and Systems I: Regular Papers, 64(7):1853-1864. DOI: 10.1109/TCSI.2017.2681723. Online publication date: July 2017.
  • (2015) Intelligent Vision Systems. Proceedings of the 2015 IEEE International Symposium on Nanoelectronic and Information Systems (iNIS), pages 77-82. DOI: 10.1109/iNIS.2015.69. Online publication date: 21 December 2015.
  • (2013) Modular Design of Fully Pipelined Reduction Circuits on FPGAs. IEEE Transactions on Parallel and Distributed Systems, 24(9):1818-1826. DOI: 10.1109/TPDS.2012.267. Online publication date: 1 September 2013.
  • (2013) Local Interpolation-based Polar Format SAR. Journal of Signal Processing Systems, 71(3):297-312. DOI: 10.1007/s11265-012-0720-4. Online publication date: 1 June 2013.
  • (2012) Emulating Mammalian Vision on Reconfigurable Hardware. Proceedings of the 2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines, pages 141-148. DOI: 10.1109/FCCM.2012.33. Online publication date: 29 April 2012.
