
Evaluating the Performance of Cache-Affinity Scheduling in Shared-Memory Multiprocessors

Published: 01 February 1995

Abstract

As a process executes on a processor, it builds up state in that processor's cache. In multiprogrammed workloads, the opportunity to reuse this state may be lost when a process gets rescheduled, either because intervening processes destroy its cache state or because the process migrates to another processor. In this paper, we explore affinity scheduling, a technique that helps reduce cache misses by preferentially scheduling a process on a processor where it has run recently. Our study focuses on a bus-based multiprocessor executing a variety of workloads, including mixes of scientific, software development, and database applications. In addition to quantifying the performance benefits of exploiting affinity, our study is distinctive in that it provides low-level data from a hardware performance monitor that details why the workloads perform as they do. Overall, for the workloads studied, we show that affinity scheduling reduces the number of cache misses by 7-36%, resulting in execution time improvements of up to 10%. Although the overall improvements are small, modifying the operating system scheduler to exploit affinity appears worthwhile: affinity has no negative impact on the workloads, and we show that it is extremely simple to add to existing schedulers.
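To make the idea of "preferentially scheduling a process on a processor where it has run recently" concrete, here is a minimal sketch of one way a priority scheduler could express that preference. It is an illustration only and is not taken from the paper: the `Process` record, the `pick_next` function, the single shared run queue, and the `AFFINITY_BOOST` constant are all hypothetical names and assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Process:
    pid: int
    priority: int                   # assumed convention: higher value runs first
    last_cpu: Optional[int] = None  # CPU this process most recently ran on


# Hypothetical tuning constant: how much extra priority a process receives
# on the CPU whose cache may still hold its state.
AFFINITY_BOOST = 5


def pick_next(run_queue: List[Process], cpu: int) -> Optional[Process]:
    """Remove and return the process that `cpu` should run next.

    Each candidate is ranked by its base priority plus a boost if it last
    ran on this CPU. Returns None when the queue is empty.
    """
    if not run_queue:
        return None

    def effective_priority(p: Process) -> int:
        return p.priority + (AFFINITY_BOOST if p.last_cpu == cpu else 0)

    best = max(run_queue, key=effective_priority)
    run_queue.remove(best)
    best.last_cpu = cpu  # record where it runs, for future affinity decisions
    return best


if __name__ == "__main__":
    queue = [Process(pid=1, priority=10, last_cpu=0),
             Process(pid=2, priority=12, last_cpu=1)]
    chosen = pick_next(queue, cpu=0)
    print(chosen.pid)  # pid 1: its boosted priority (15) beats pid 2's (12)
```

Because the boost is additive rather than absolute, a process with a sufficiently higher base priority is still chosen ahead of one that merely has affinity, so the preference in this sketch never blocks urgent work.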

Published In

Journal of Parallel and Distributed Computing, Volume 24, Issue 2
Feb. 1, 1995
116 pages
ISSN: 0743-7315

Publisher

Academic Press, Inc.

United States

Qualifiers

  • Research-article

Cited By

  • (2018) Package-Aware Scheduling of FaaS Functions. Companion of the 2018 ACM/SPEC International Conference on Performance Engineering, pp. 101-106. DOI: 10.1145/3185768.3186294. Online publication date: 2-Apr-2018
  • (2016) Jump over ASLR. The 49th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1-13. DOI: 10.5555/3195638.3195686. Online publication date: 15-Oct-2016
  • (2013) Uncovering CPU load balancing policies with harmony. Proceedings of the ACM International Conference on Computing Frontiers, pp. 1-10. DOI: 10.1145/2482767.2482784. Online publication date: 14-May-2013
  • (2012) Share memory aware scheduler. Proceedings of the great lakes symposium on VLSI, pp. 291-294. DOI: 10.1145/2206781.2206852. Online publication date: 3-May-2012
  • (2008) The shared-thread multiprocessor. Proceedings of the 22nd annual international conference on Supercomputing, pp. 73-82. DOI: 10.1145/1375527.1375541. Online publication date: 7-Jun-2008
  • (2008) P-Ray. Languages and Compilers for Parallel Computing, pp. 187-201. DOI: 10.1007/978-3-540-89740-8_13. Online publication date: 28-Nov-2008
  • (2008) Performance Implications of Cache Affinity on Multicore Processors. Proceedings of the 14th international Euro-Par conference on Parallel Processing, pp. 151-161. DOI: 10.1007/978-3-540-85451-7_17. Online publication date: 26-Aug-2008
  • (2006) Symbiotic space-sharing on SDSC's datastar system. Proceedings of the 12th international conference on Job scheduling strategies for parallel processing, pp. 192-209. DOI: 10.5555/1757044.1757054. Online publication date: 26-Jun-2006
  • (2006) Balancing power consumption in multiprocessor systems. ACM SIGOPS Operating Systems Review 40:4, pp. 403-414. DOI: 10.1145/1218063.1217974. Online publication date: 18-Apr-2006
  • (2006) Balancing power consumption in multiprocessor systems. Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006, pp. 403-414. DOI: 10.1145/1217935.1217974. Online publication date: 18-Apr-2006
