DOI: 10.1145/195473.195531

Surpassing the TLB performance of superpages with less operating system support

Published: 01 November 1994

Abstract

Many commercial microprocessor architectures have added translation lookaside buffer (TLB) support for superpages. Superpages differ from segments because their size must be a power-of-two multiple of the base page size and they must be aligned in both virtual and physical address spaces. Very large superpages (e.g., 1MB) are clearly useful for mapping special structures, such as kernel data or frame buffers. This paper considers the architectural and operating system support required to exploit medium-sized superpages (e.g., 64KB, i.e., sixteen times a 4KB base page size). First, we show that superpages improve TLB performance only after invasive operating system modifications that introduce considerable overhead.
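The superpage constraints stated above (size is a power-of-two multiple of the base page size, aligned in both address spaces) can be sketched as a simple check. This is an illustrative sketch only: the function name `is_valid_superpage` and the constant `BASE_PAGE` are hypothetical and do not come from the paper.

```python
BASE_PAGE = 4 * 1024  # 4KB base page, matching the abstract's example


def is_valid_superpage(vaddr: int, paddr: int, size: int,
                       base: int = BASE_PAGE) -> bool:
    """Return True if (vaddr, paddr, size) meets the superpage constraints."""
    # The size must be a whole multiple of the base page size.
    if size < base or size % base != 0:
        return False
    multiple = size // base
    # That multiple must itself be a power of two (1, 2, 4, ..., 16, ...).
    if multiple & (multiple - 1) != 0:
        return False
    # Both the virtual and the physical address must be size-aligned.
    return vaddr % size == 0 and paddr % size == 0
```

For a 64KB superpage, a mapping from virtual `0x10000` to physical `0x40000` passes, while a physical address of `0x41000` fails because it is only 4KB-aligned.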
We then propose two subblock TLB designs as alternate ways to improve TLB performance. Analogous to a subblock cache, a complete-subblock TLB associates a tag with a superpage-sized region but has valid bits, physical page number, attributes, etc., for each possible base page mapping. A partial-subblock TLB entry is much smaller than a complete-subblock TLB entry, because it shares physical page number and attribute fields across base page mappings. A drawback of a partial-subblock TLB is that base page mappings can share a TLB entry only if they map to consecutive physical pages and have the same attributes. We propose a physical memory allocation algorithm, page reservation, that makes this sharing more likely. When page reservation is used, experimental results show partial-subblock TLBs perform better than superpage TLBs, while requiring simpler operating system changes. If operating system changes are inappropriate, however, complete-subblock TLBs perform best.
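The sharing rule for a partial-subblock TLB entry can be illustrated with a small software model. This is a sketch under stated assumptions, not the paper's hardware design: the class `PartialSubblockEntry`, the helper `new_entry`, the field names, and the choice to store the physical page number that subblock 0 would map to are all hypothetical. It assumes 16 base pages per entry, following the 64KB/4KB example in the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SUBBLOCKS = 16  # base pages per superpage-sized region (64KB / 4KB)


@dataclass
class PartialSubblockEntry:
    tag: int       # virtual page number divided by SUBBLOCKS
    ppn_base: int  # physical page number that subblock 0 would map to
    attrs: int     # attributes shared by every base page in the entry
    valid: List[bool] = field(default_factory=lambda: [False] * SUBBLOCKS)

    def try_add(self, vpn: int, ppn: int, attrs: int) -> bool:
        """Add a base page mapping if it can share this entry."""
        idx = vpn % SUBBLOCKS
        if vpn // SUBBLOCKS != self.tag:
            return False  # belongs to a different superpage-sized region
        if attrs != self.attrs or ppn != self.ppn_base + idx:
            return False  # not consecutive, or attributes differ
        self.valid[idx] = True
        return True

    def translate(self, vpn: int) -> Optional[int]:
        """Return the physical page number, or None on a miss."""
        idx = vpn % SUBBLOCKS
        if vpn // SUBBLOCKS == self.tag and self.valid[idx]:
            return self.ppn_base + idx
        return None


def new_entry(vpn: int, ppn: int, attrs: int) -> PartialSubblockEntry:
    """Create an entry from the first base page mapping in a region."""
    idx = vpn % SUBBLOCKS
    entry = PartialSubblockEntry(vpn // SUBBLOCKS, ppn - idx, attrs)
    entry.valid[idx] = True
    return entry
```

In this model a mapping that lands on a non-consecutive physical page, or carries different attributes, is rejected by `try_add` and would need its own entry. That is the case page reservation aims to make less common. A complete-subblock entry would instead keep a separate physical page number and attribute field per subblock, so it would accept any mapping within the tagged region at the cost of a larger entry.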


Published In

ASPLOS VI: Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
November 1994
341 pages
ISBN:0897916603
DOI:10.1145/195473
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States

Conference: ASPLOS94

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Cited By

  • (2023) Mosaic Pages: Big TLB Reach with Small Pages. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pages 433-448. DOI: 10.1145/3582016.3582021. Online publication date: 25-Mar-2023
  • (2023) FlexPointer: Fast Address Translation Based on Range TLB and Tagged Pointers. ACM Transactions on Architecture and Code Optimization, 20:2, pages 1-24. DOI: 10.1145/3579854. Online publication date: 1-Mar-2023
  • (2023) Contiguitas: The Pursuit of Physical Memory Contiguity in Datacenters. Proceedings of the 50th Annual International Symposium on Computer Architecture, pages 1-15. DOI: 10.1145/3579371.3589079. Online publication date: 17-Jun-2023
  • (2023) Memory-Efficient Hashed Page Tables. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pages 1221-1235. DOI: 10.1109/HPCA56546.2023.10071061. Online publication date: Feb-2023
  • (2023) On-Demand Triggered Memory Management Unit in Dynamic Binary Translator. Advanced Parallel Processing Technologies, pages 297-309. DOI: 10.1007/978-981-99-7872-4_17. Online publication date: 8-Nov-2023
  • (2021) Rebooting Virtual Memory with Midgard. 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), pages 512-525. DOI: 10.1109/ISCA52012.2021.00047. Online publication date: Jun-2021
  • (2021) Exploiting Page Table Locality for Agile TLB Prefetching. 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), pages 85-98. DOI: 10.1109/ISCA52012.2021.00016. Online publication date: Jun-2021
  • (2021) Efficient Huge Page Management with Xpage. 2021 IEEE International Conference on Big Data (Big Data), pages 1317-1326. DOI: 10.1109/BigData52589.2021.9672050. Online publication date: 15-Dec-2021
  • (2020) A comprehensive analysis of superpage management mechanisms and policies. Proceedings of the 2020 USENIX Conference on Usenix Annual Technical Conference, pages 829-842. DOI: 10.5555/3489146.3489203. Online publication date: 15-Jul-2020
  • (2020) CHiRP: Control-Flow History Reuse Prediction. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 131-145. DOI: 10.1109/MICRO50266.2020.00023. Online publication date: Oct-2020
