DOI: 10.5555/2688394.2688396

Analysis of the effect of core affinity on high-throughput flows

Published: 16 November 2014

Abstract

Network throughput is scaling up to higher data rates while end-system processors are scaling out to multiple cores. To optimize high-speed data transfer into multicore end systems, techniques such as network adapter offloads and performance tuning have received a great deal of attention. Furthermore, several methods of multithreading the network receive process have been proposed. However, attention thus far has focused on how to set the tuning parameters and which offloads to select for higher performance, and little has been done to understand why the settings do (or do not) work. In this paper we build on previous research to track down the source(s) of the end-system bottleneck for high-speed TCP flows. For the purposes of this paper, we consider protocol processing efficiency to be the amount of system resources used (such as CPU and cache) per unit of achieved throughput (in Gbps). The amounts of the various system resources consumed are measured using low-level system event counters. Affinitization, or core binding, is the decision about which processor cores on an end system are responsible for interrupt, network, and application processing. We conclude that affinitization has a significant impact on protocol processing efficiency, and that the performance bottleneck of the network receive process changes drastically across three distinct affinitization scenarios.
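
The two definitions in the abstract, core binding and resources used per unit of achieved throughput, can be illustrated with a minimal Python sketch. It is not the authors' measurement code. The function names are hypothetical, the efficiency figure is assumed to be a raw event count over the measurement interval divided by throughput in Gbps, and the pinning helper relies on `os.sched_setaffinity`, which is available only on Linux. Interrupt affinity is configured separately (on Linux, typically through `/proc/irq/<n>/smp_affinity`) and is not covered here.

```python
import os


def throughput_gbps(bytes_received: int, seconds: float) -> float:
    """Achieved throughput in gigabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_received * 8 / seconds / 1e9


def resources_per_gbps(event_count: float, bytes_received: int, seconds: float) -> float:
    """Events consumed (e.g. CPU cycles or cache misses) per Gbps achieved.

    Assumption: event_count is the total read from a hardware event counter
    over the same interval in which bytes_received were delivered.
    Lower values mean more efficient protocol processing.
    """
    return event_count / throughput_gbps(bytes_received, seconds)


def pin_to_core(core: int) -> set:
    """Bind the calling process to a single core (Linux only).

    Returns the resulting affinity set so the caller can verify it.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity control requires Linux")
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return os.sched_getaffinity(0)
```

For example, a receiver that delivers 1.25 GB in one second runs at 10 Gbps, and if it consumed 5e9 cycles in that second, `resources_per_gbps(5e9, 1_250_000_000, 1.0)` gives 5e8 cycles per Gbps. Comparing this figure across different bindings of the application process is one way to see the effect of affinitization on efficiency.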

Cited By

  • (2018) A Survey of End-System Optimizations for High-Speed Networks. ACM Computing Surveys 51:3 (1-36). DOI: 10.1145/3184899. Online publication date: 16-Jul-2018
  • (2016) Long-haul secure data transfer using hardware-assisted GridFTP. Future Generation Computer Systems 56:C (265-276). DOI: 10.1016/j.future.2015.09.014. Online publication date: 1-Mar-2016

Published In

NDM '14: Proceedings of the Fourth International Workshop on Network-Aware Data Management
November 2014
37 pages
ISBN:9781479970193
  • General Chairs: Mehmet Balman, Surendra Byna, Brian L. Tierney

Publisher

IEEE Press

Qualifiers

  • Research-article

Conference

SC '14
Acceptance Rates

Overall Acceptance Rate 14 of 23 submissions, 61%

Article Metrics

  • Downloads (last 12 months): 1
  • Downloads (last 6 weeks): 0
Reflects downloads up to 06 Jan 2025
