DOI: 10.1145/3559009.3569677
research-article

DSDP: Dual Stream Data Prefetcher

Published: 27 January 2023

Abstract

Hardware prefetching is an important technique for hiding DRAM latency. Designing a prefetcher that maximizes system performance often requires a delicate balance between coverage and accuracy, and accuracy becomes more important as the number of cores increases. Separating the access stream by memory access instruction (PC localization) is an effective way to improve accuracy. However, this technique discards cross-PC relationships, so it can miss prefetch opportunities and even reduce coverage.
In this paper, we propose a spatial prefetcher called Dual Stream Data Prefetcher (DSDP). DSDP achieves high coverage and accuracy through two design choices. First, it improves prefetching accuracy through PC localization. Second, it simultaneously learns cross-PC relationships from the spatially localized stream to recover the prefetching opportunities lost to PC localization.
Our experimental results show that DSDP improves system performance by 41.9% over a baseline with no data prefetcher and by 4.1% over the state-of-the-art spatial data prefetcher.
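
The two views of the access stream named in the abstract can be illustrated with a small sketch. This is not DSDP itself: the abstract does not describe the prefetcher's tables or training rules, so the code below only shows how one trace might be split into a PC-localized stream and a spatially localized stream, plus the usual definitions of coverage and accuracy. The 64-byte block size, the 4 KiB region size, the function names, and the example trace are all assumptions made for illustration.

```python
from collections import defaultdict

BLOCK_BITS = 6    # assumption: 64-byte cache blocks
REGION_BITS = 12  # assumption: 4 KiB spatial regions


def pc_localized_deltas(trace):
    """Split the trace by PC and return each PC's block-address deltas."""
    last_block = {}
    deltas = defaultdict(list)
    for pc, addr in trace:
        block = addr >> BLOCK_BITS
        if pc in last_block:
            deltas[pc].append(block - last_block[pc])
        last_block[pc] = block
    return dict(deltas)


def region_localized_offsets(trace):
    """Split the trace by spatial region, keeping (pc, block offset) in order.

    Accesses from different PCs stay interleaved inside a region, so
    cross-PC relationships remain visible in this view.
    """
    offset_mask = (1 << (REGION_BITS - BLOCK_BITS)) - 1
    regions = defaultdict(list)
    for pc, addr in trace:
        region = addr >> REGION_BITS
        offset = (addr >> BLOCK_BITS) & offset_mask
        regions[region].append((pc, offset))
    return dict(regions)


def coverage_and_accuracy(prefetched_blocks, baseline_miss_blocks):
    """Coverage = useful / baseline misses; accuracy = useful / issued."""
    useful = len(prefetched_blocks & baseline_miss_blocks)
    coverage = useful / len(baseline_miss_blocks) if baseline_miss_blocks else 0.0
    accuracy = useful / len(prefetched_blocks) if prefetched_blocks else 0.0
    return coverage, accuracy


# Hypothetical trace: two load instructions walk the same region,
# each touching every second block, interleaved with one another.
example_trace = [
    (0x400, 0x1000), (0x404, 0x1040),
    (0x400, 0x1080), (0x404, 0x10C0),
    (0x400, 0x1100), (0x404, 0x1140),
]

per_pc = pc_localized_deltas(example_trace)
per_region = region_localized_offsets(example_trace)
```

In this toy trace each PC on its own shows a constant stride of two blocks, while the region view shows that the second PC always touches the block right after the first PC's. That second relationship is the kind of cross-PC information a purely PC-localized view cannot see.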

Cited By

  • (2024) Chimera: Leveraging Hybrid Offsets for Efficient Data Prefetching. Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 144-155. DOI: 10.1145/3656019.3689613. Online publication date: 14-Oct-2024

Published In

PACT '22: Proceedings of the International Conference on Parallel Architectures and Compilation Techniques
October 2022
569 pages
ISBN: 9781450398688
DOI: 10.1145/3559009

In-Cooperation

  • IFIP WG 10.3
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. caching
  2. data prefetching
  3. microarchitecture

Qualifiers

  • Research-article

Conference

PACT '22
Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%