DOI: 10.1145/2502524.2502548
Research article

Zero-copy I/O processing for low-latency GPU computing

Published: 08 April 2013

Abstract

Cyber-physical systems (CPS) aim to monitor and control complex real-world phenomena, where computational cost and real-time constraints can be major challenges. Many-core hardware accelerators such as graphics processing units (GPUs) promise to enhance computation by leveraging the data parallelism often found in real-world CPS scenarios, but their performance is limited by the overhead of transferring data between host and device memory. For example, plasma control in the HBT-EP Tokamak device at Columbia University [11, 18] must execute the control algorithm in a few microseconds, but copying the data set between host and device memory may take tens of microseconds. This paper presents a zero-copy I/O processing scheme that maps the I/O address space of the system to the virtual address space of the compute device, allowing sensors and actuators to transfer data to and from the compute device directly. Experiments using the plasma control system show a 33% reduction in computational cost, and microbenchmarks with more generic matrix operations show a 34% reduction; in both cases, effective data throughput remains at least as good as that of the current best performers.
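The difference between a copy-based path and a mapped, zero-copy path can be illustrated with a small host-side analogy. The sketch below is not the paper's implementation, which maps the system's I/O address space into the GPU's virtual address space. It only assumes Python's standard `mmap` module and `memoryview` to contrast a staged copy with a direct view of the same memory. The names `BUF_SIZE`, `sensor_write`, `staged_copy`, and `mapped_view` are hypothetical and chosen for illustration.

```python
import mmap

BUF_SIZE = 4096  # hypothetical buffer size (one page), for illustration only


def sensor_write(buf, payload):
    """Stand-in for a sensor depositing samples into the shared buffer."""
    buf[:len(payload)] = payload


# Anonymous memory mapping standing in for the I/O buffer.
io_region = mmap.mmap(-1, BUF_SIZE)

# First batch of samples arrives.
sensor_write(io_region, b"\x01\x02\x03\x04")

# Copy-based path: the consumer snapshots the data into its own memory.
staged_copy = bytes(io_region[:4])

# Zero-copy path: the consumer holds a view onto the same memory.
full_view = memoryview(io_region)
mapped_view = full_view[:4]

# Second batch of samples overwrites the buffer.
sensor_write(io_region, b"\x0a\x0b\x0c\x0d")

# The staged copy is stale until another copy is made;
# the mapped view reflects the new samples with no extra transfer.
view_sees = bytes(mapped_view)
print("staged copy:", list(staged_copy))  # [1, 2, 3, 4]
print("mapped view:", list(view_sees))    # [10, 11, 12, 13]

# Views must be released before the mapping can be closed.
mapped_view.release()
full_view.release()
io_region.close()
```

In this analogy, every refresh of `staged_copy` costs a transfer proportional to the data size, which corresponds to the host-to-device copy overhead the abstract describes, whereas the mapped view incurs no per-update copy.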

References

[1] G. Elliott and J. Anderson. Globally Scheduled Real-Time Multiprocessor Systems with GPUs. Real-Time Systems, 48(1):34-74, 2012.
[2] G. Elliott and J. Anderson. Robust Real-Time Multiprocessor Interrupt Handling Motivated by GPUs. In Proc. of the Euromicro Conference on Real-Time Systems, pages 267-276, 2012.
[3] M. Hirabayashi, S. Kato, M. Edahiro, and Y. Sugiyama. Toward GPU-accelerated traffic simulation and its real-time challenge. In Proc. of the International Workshop on Real-Time and Distributed Computing in Emerging Applications, 2012.
[4] T. Jablin, P. Prabhu, J. Jablin, N. Johnson, S. Beard, and D. August. Automatic CPU-GPU communication management and optimization. In Proc. of the ACM Conference on Programming Language Design and Implementation, 2011.
[5] S. Kato, K. Lakshmanan, Y. Ishikawa, and R. Rajkumar. Resource Sharing in GPU-accelerated Windowing Systems. In Proc. of the IEEE Real-Time and Embedded Technology and Applications Symposium, pages 191-200, 2011.
[6] S. Kato, K. Lakshmanan, A. Kumar, M. Kelkar, Y. Ishikawa, and R. Rajkumar. RGEM: A Responsive GPGPU Execution Model for Runtime Engines. In Proc. of the IEEE Real-Time Systems Symposium, pages 57-66, 2011.
[7] S. Kato, K. Lakshmanan, R. Rajkumar, and Y. Ishikawa. TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments. In Proc. of the USENIX Annual Technical Conference, 2011.
[8] S. Kato, M. McThrow, C. Maltzahn, and S. Brandt. Gdev: First-Class GPU Resource Management in the Operating System. In Proc. of the USENIX Annual Technical Conference, 2012.
[9] C. Liu, J. Li, W. Huang, J. Rubio, E. Speight, and X. Lin. Power-Efficient Time-Sensitive Mapping in Heterogeneous Systems. In Proc. of the International Conference on Parallel Architectures and Compilation Techniques, 2012.
[10] R. Mangharam and A. Saba. Anytime Algorithms for GPU Architectures. In Proc. of the IEEE Real-Time Systems Symposium, pages 47-56, 2011.
[11] D. Maurer, J. Bialek, P. Byrne, B. DeBono, J. Levesque, B. Q. Li, et al. The high beta tokamak-extended pulse magnetohydrodynamic mode control research program. Plasma Physics and Controlled Fusion, 53, 2011.
[12] M. McNaughton, C. Urmson, J. Dolan, and J.-W. Lee. Motion Planning for Autonomous Driving with a Conformal Spatiotemporal Lattice. In Proc. of the IEEE International Conference on Robotics and Automation, pages 4889-4895, 2011.
[13] Mellanox. NVIDIA GPUDirect Technology: Accelerating GPU-based Systems, 2010.
[14] P. Michel, J. Chestnutt, S. Kagami, K. Nishiwaki, J. Kuffner, and T. Kanade. GPU-accelerated Real-Time 3D Tracking for Humanoid Locomotion and Stair Climbing. In Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 463-469, 2007.
[15] NVIDIA. NVIDIA's next generation CUDA computer architecture: Fermi, 2009.
[16] NVIDIA. CUDA C Programming Guide Version 4.2, 2012.
[17] NVIDIA. NVIDIA GeForce GTX 680: The fastest, most efficient GPU ever built, 2012.
[18] N. Rath, J. Bialek, P. Byrne, B. DeBono, J. Levesque, B. Li, M. Mauel, D. Maurer, G. Navratil, and D. Shiraki. High-speed, multi-input, multi-output control using GPU processing in the HBT-EP tokamak. Fusion Engineering and Design, 2012.
[19] C. Rossbach, J. Currey, M. Silberstein, B. Ray, and E. Witchel. PTask: Operating system abstractions to manage GPUs as compute devices. In Proc. of the ACM Symposium on Operating Systems Principles, 2011.


Published In

ICCPS '13: Proceedings of the ACM/IEEE 4th International Conference on Cyber-Physical Systems
April 2013
278 pages
ISBN: 9781450319966
DOI: 10.1145/2502524
Publisher

Association for Computing Machinery

New York, NY, United States

Cited By

  • (2023) Parallel Shooting Sequential Quadratic Programming for Nonlinear MPC Problems. 2023 IEEE Conference on Control Technology and Applications (CCTA), pages 605-611. DOI: 10.1109/CCTA54093.2023.10252893. Online publication date: 16-Aug-2023.
  • (2021) A Trade-Off between Computing Power and Energy Consumption of On-Board Data Processing in GPU Accelerated In-Orbit Space Systems. TRANSACTIONS OF THE JAPAN SOCIETY FOR AERONAUTICAL AND SPACE SCIENCES, AEROSPACE TECHNOLOGY JAPAN, 19:5, pages 700-708. DOI: 10.2322/tastj.19.700. Online publication date: 2021.
  • (2021) OpenUVR: an Open-Source System Framework for Untethered Virtual Reality Applications. 2021 IEEE 27th Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 223-236. DOI: 10.1109/RTAS52030.2021.00026. Online publication date: May-2021.
  • (2019) Optimization Methods for Computing System in Mobile CPS. Proceedings of the 2nd International Conference on Big Data Technologies, pages 300-305. DOI: 10.1145/3358528.3358551. Online publication date: 28-Aug-2019.
  • (2018) Cooperative GPGPU Scheduling for Consolidating Server Workloads. IEICE Transactions on Information and Systems, E101.D:12, pages 3019-3037. DOI: 10.1587/transinf.2018EDP7027. Online publication date: 1-Dec-2018.
  • (2018) Morpheus. ACM SIGOPS Operating Systems Review, 52:1, pages 71-83. DOI: 10.1145/3273982.3273989. Online publication date: 28-Aug-2018.
  • (2017) GLoop. Proceedings of the 2017 Symposium on Cloud Computing, pages 80-93. DOI: 10.1145/3127479.3132023. Online publication date: 24-Sep-2017.
  • (2017) On Building a Programmable Wireless High-Quality Virtual Reality System Using Commodity Hardware. Proceedings of the 8th Asia-Pacific Workshop on Systems, pages 1-7. DOI: 10.1145/3124680.3124723. Online publication date: 2-Sep-2017.
  • (2017) A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthquake Detection Code for Large Seismic Datasets. Practice and Experience in Advanced Research Computing 2017: Sustainability, Success and Impact, pages 1-6. DOI: 10.1145/3093338.3093382. Online publication date: 9-Jul-2017.
  • (2017) Real-Time GPU Resource Management with Loadable Kernel Modules. IEEE Transactions on Parallel and Distributed Systems, 28:6, pages 1715-1727. DOI: 10.1109/TPDS.2016.2630697. Online publication date: 1-Jun-2017.
