Symbiotic jobscheduling for a simultaneous multithreaded processor

Published: 12 November 2000

Abstract

Simultaneous multithreading (SMT) machines fetch and execute instructions from multiple instruction streams to increase system utilization and speed up the execution of jobs. When there are more jobs in the system than there is hardware to support simultaneous execution, the operating system scheduler must choose the set of jobs to coschedule.

This paper demonstrates that performance on a hardware multithreaded processor is sensitive to the set of jobs that are coscheduled by the operating system jobscheduler. Thus, the full benefits of SMT hardware can only be achieved if the scheduler is aware of thread interactions. Here, a mechanism is presented that allows the scheduler to significantly raise the performance of SMT architectures. This is done without any advance knowledge of a workload's characteristics, using sampling to identify jobs that run well together.

We demonstrate an SMT jobscheduler called SOS. SOS combines an overhead-free sample phase, which collects information about various possible schedules, and a symbiosis phase, which uses that information to predict which schedule will provide the best performance. We show that a small sample of the possible schedules is sufficient to identify a good schedule quickly. On a system with random job arrivals and departures, response time is improved by as much as 17% over a schedule that does not incorporate symbiosis.
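The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the SOS implementation: the function names, the random sampling strategy, the use of mean IPC as the predictor, and the toy measurement function are all assumptions made for illustration, since the abstract does not specify which predictors SOS uses or how it draws its samples.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A coschedule is the set of jobs that share the processor in one timeslice;
# a schedule is the sequence of coschedules that covers every job.
Coschedule = Tuple[str, ...]
Schedule = Tuple[Coschedule, ...]


def sample_schedules(jobs: Sequence[str], contexts: int, samples: int,
                     rng: random.Random) -> List[Schedule]:
    """Sample phase (assumed strategy): draw a few random schedules.

    Each schedule shuffles the jobs and cuts them into groups of at most
    `contexts` jobs, one group per hardware-context-sized coschedule.
    """
    candidates = []
    for _ in range(samples):
        order = list(jobs)
        rng.shuffle(order)
        groups = tuple(tuple(order[i:i + contexts])
                       for i in range(0, len(order), contexts))
        candidates.append(groups)
    return candidates


def choose_best(candidates: Sequence[Schedule],
                measure: Callable[[Coschedule], float]) -> Schedule:
    """Symbiosis phase (assumed predictor): keep the schedule whose
    coschedules had the highest mean measured throughput."""
    def score(schedule: Schedule) -> float:
        return sum(measure(group) for group in schedule) / len(schedule)
    return max(candidates, key=score)


def toy_ipc(group: Coschedule) -> float:
    """Hypothetical stand-in for hardware counters: pretend that jobs of
    different kinds (name without trailing digits) run well together."""
    kinds = {job.rstrip("0123456789") for job in group}
    return float(len(kinds))


if __name__ == "__main__":
    jobs = ["fp1", "fp2", "int1", "int2"]
    candidates = sample_schedules(jobs, contexts=2, samples=5,
                                  rng=random.Random(0))
    print(choose_best(candidates, toy_ipc))
```

In a real scheduler the measurement would come from running each sampled coschedule for a timeslice and reading performance counters, so the jobs keep making progress while samples are collected; here `toy_ipc` only imitates that signal.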

Published In

ACM SIGOPS Operating Systems Review, Volume 34, Issue 5
Dec. 2000
269 pages
ISSN: 0163-5980
DOI: 10.1145/384264

ASPLOS IX: Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
November 2000
271 pages
ISBN: 1581133170
DOI: 10.1145/378993

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States