research-article

DCS-ctrl: a fast and flexible device-control mechanism for device-centric server architecture

Authors:

Jangwoo KimAuthors Info & Claims

ISCA '18: Proceedings of the 45th Annual International Symposium on Computer Architecture

Pages 491 - 504

https://doi.org/10.1109/ISCA.2018.00048

Published: 02 June 2018 Publication History

Get Access

Abstract

Modern high-performance servers leverage a large number of emerging peripheral devices (e.g., data processing accelerators, non-volatile memory storage, high-bandwidth network cards) to meet ever-increasing performance demands of server applications. However, as such servers experience severe kernel overhead due to frequently invoked device operations (e.g., buffer management and data copy), server architects have proposed various hardware and software approaches to enable direct communications among the devices. Unfortunately, existing direct device-to-device (D2D) communication schemes still suffer from low performance and the lack of flexibility. First, software-based schemes depend on complicated kernel routines and necessitate multiple hardware-software and user-kernel boundary crossings, which significantly limit the performance improvement opportunities from direct D2D communications. On the other hand, hardware-based schemes require tight integration and custom-built devices, preventing architects from flexibly adding off-the-shelf devices.

In this paper, we propose DCS-ctrl, a novel Hardware-based Device-Control (HDC) mechanism for Device-Centric Server (DCS) architecture to provide fast and CPU-efficient direct D2D communications among a large number of off-the-shelf peripheral devices. The key idea of DCS-ctrl is to implement a low-cost and flexible device-control mechanism on an independent FPGA device called HDC Engine. As HDC Engine manages all data and control transfers among devices at the hardware level, the server achieves high performance, scalability, and flexibility. First, optimizing both data and control paths at the hardware level minimizes the latency of inter-device communications. Second, implementing FPGA-based reconfigurable device controllers enables direct D2D communications among commodity devices and thus improves per-device flexibility. Third, merging heterogeneous device operations with intermediate data processing supports creates more opportunities for direct inter-device communications in server applications. Our DCS-ctrl prototype reduces the latency of software-based direct D2D communications by 42% and the CPU utilization by 52%.

References

[1]

N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers, R. Boyle, P. luc Cantin, C. Chao, C. Clark, J. Coriell, M. Daley, M. Dau, J. Dean, B. Gelb, T. V. Ghaemmaghami, R. Gottipati, W. Gulland, R. Hagmann, C. R. Ho, D. Hogberg, J. Hu, R. Hundt, D. Hurt, J. Ibarz, A. Jaffey, A. Jaworski, A. Kaplan, H. Khaitan, D. Killebrew, A. Koch, N. Kumar, S. Lacy, J. Laudon, J. Law, D. Le, C. Leary, Z. Liu, K. Lucke, A. Lundin, G. MacKean, A. Maggiore, M. Mahony, K. Miller, R. Nagarajan, R. Narayanaswami, R. Ni, K. Nix, T. Norrie, M. Omernick, N. Penukonda, A. Phelps, J. Ross, M. Ross, A. Salek, E. Samadiani, C. Severn, G. Sizikov, M. Snelham, J. Souter, D. Steinberg, A. Swing, M. Tan, G. Thorson, B. Tian, H. Toma, E. Tuttle, V. Vasudevan, R. Walter, W. Wang, E. Wilcox, and D. H. Yoon, "In-Datacenter Performance Analysis of a Tensor Processing Unit," in Proc. 44th International Symposium on Computer Architecture (ISCA), 2017.

Abstract

References

Cited By

Index Terms

Recommendations

DCS: a fast and scalable device-centric server architecture

A power-efficient and high performance FPGA accelerator for convolutional neural networks: work-in-progress

LW-GCN: A Lightweight FPGA-based Graph Convolutional Network Accelerator

Comments

Information

Published In

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Login options

Full Access

View options

PDF

eReader

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations