AMP: adaptive multi-stream prefetching in a shared cache

Authors: Binny S. Gill, Luis Angel D. Bathen

FAST '07: Proceedings of the 5th USENIX conference on File and Storage Technologies

Page 26

Published: 13 February 2007

Abstract

Prefetching is a widely used technique in modern data storage systems. We study the most widely used class of prefetching algorithms known as sequential prefetching. There are two problems that plague the state-of-the-art sequential prefetching algorithms: (i) cache pollution, which occurs when prefetched data replaces more useful prefetched or demand-paged data, and (ii) prefetch wastage, which happens when prefetched data is evicted from the cache before it can be used.
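To make prefetch wastage concrete, here is a minimal toy sketch, not the paper's model or AMP itself. It assumes a shared LRU cache, round-robin sequential streams, and a fixed synchronous prefetcher; the function name `simulate` and all of its parameters are hypothetical. It counts prefetched pages that are evicted before any stream reads them.

```python
from collections import OrderedDict


def simulate(num_streams, reads_per_stream, degree, cache_pages):
    """Toy model (assumption): shared LRU cache, fixed prefetch degree on each miss.

    Returns (misses, wasted), where `wasted` counts prefetched pages that
    were evicted before being read.
    """
    # Maps page -> True while the page is prefetched but not yet used.
    # Insertion order doubles as LRU order (oldest first).
    cache = OrderedDict()
    misses = 0
    wasted = 0

    def insert(page, prefetched):
        nonlocal wasted
        if page in cache:
            return
        if len(cache) >= cache_pages:
            _, unused = cache.popitem(last=False)  # evict the LRU page
            wasted += int(unused)                  # evicted before use: wastage
        cache[page] = prefetched

    for i in range(reads_per_stream):
        for s in range(num_streams):          # streams take turns, one page each
            page = (s, i)
            if page in cache:
                cache[page] = False           # the page has now been used
                cache.move_to_end(page)       # refresh its LRU position
            else:
                misses += 1
                insert(page, False)           # demand-paged page
                for k in range(1, degree + 1):
                    insert((s, i + k), True)  # prefetch the next `degree` pages
    return misses, wasted


if __name__ == "__main__":
    # Two streams sharing a 4-page cache: a large degree thrashes, a small one does not.
    print(simulate(2, 8, degree=3, cache_pages=4))
    print(simulate(2, 8, degree=1, cache_pages=4))
```

In this toy setting, a prefetch degree that is too large for the shared cache causes each stream to evict the other's unread prefetched pages, which is the wastage the abstract describes.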

A sequential prefetching algorithm can have a fixed or adaptive degree of prefetch and can be either synchronous (when it can prefetch only on a miss) or asynchronous (when it can also prefetch on a hit). To capture these distinctions, we define four classes of prefetching algorithms: Fixed Synchronous (FS), Fixed Asynchronous (FA), Adaptive Synchronous (AS), and Adaptive Asynchronous (AA). We find that the relatively unexplored class of AA algorithms is in fact the most promising for sequential prefetching. We provide a first formal analysis of the criteria necessary for optimal throughput when using an AA algorithm in a cache shared by multiple steady sequential streams. We then provide a simple implementation called AMP, which adapts accordingly, leading to near-optimal performance for any kind of sequential workload and cache size.
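The four classes can be read as the product of two independent choices: whether the degree of prefetch is fixed or adaptive, and whether prefetching is allowed only on a miss or also on a hit. The sketch below encodes just that taxonomy; the class `PrefetchPolicy` and its field names are illustrative assumptions, and it does not implement AMP or any adaptation rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefetchPolicy:
    """Hypothetical descriptor for the two axes named in the abstract."""
    adaptive_degree: bool  # may the degree of prefetch change at run time?
    prefetch_on_hit: bool  # asynchronous: may also prefetch on a hit

    @property
    def class_name(self) -> str:
        degree = "Adaptive" if self.adaptive_degree else "Fixed"
        timing = "Asynchronous" if self.prefetch_on_hit else "Synchronous"
        return f"{degree} {timing}"

    @property
    def abbreviation(self) -> str:
        # First letter of each word: FS, FA, AS, or AA.
        return "".join(word[0] for word in self.class_name.split())

    def may_prefetch(self, is_hit: bool) -> bool:
        # Synchronous policies prefetch only on a miss;
        # asynchronous policies may prefetch on a hit as well.
        return (not is_hit) or self.prefetch_on_hit


if __name__ == "__main__":
    for adaptive in (False, True):
        for on_hit in (False, True):
            policy = PrefetchPolicy(adaptive, on_hit)
            print(policy.abbreviation, policy.class_name,
                  "prefetch on hit allowed:", policy.may_prefetch(is_hit=True))
```

Under this reading, AMP would fall in the class with both flags set (AA), while the simplest schemes have both cleared (FS).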

Our experimental setup consisted of an IBM xSeries 345 dual-processor server running Linux and using five SCSI disks. We observe that AMP convincingly outperforms all the contending members of the FA, FS, and AS classes for any number of streams and over all cache sizes. As anecdotal evidence, in an experiment with 100 concurrent sequential streams and varying cache sizes, AMP beats the FA, FS, and AS algorithms by 29-172%, 12-24%, and 21-210%, respectively, while outperforming OBL (One Block Lookahead) by a factor of 8. Even for complex workloads like SPC1-Read, AMP is consistently the best-performing algorithm. For the SPC2 Video-on-Demand workload, AMP can sustain at least 25% more streams than the next best algorithm. Finally, for a workload consisting of short sequences, where optimality is more elusive, AMP is able to outperform all the other contenders in overall performance.

Index Terms

1. Hardware
  1. Integrated circuits
    1. Semiconductor memory
      1. Dynamic memory
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        1. Memory management
          1. Allocation / deallocation strategies
    2. Software system structures
      1. Distributed systems organizing principles

Information

Published In

FAST '07: Proceedings of the 5th USENIX conference on File and Storage Technologies

February 2007

61 pages

Publisher

USENIX Association

United States

Qualifiers

Article

Bibliometrics

Total Citations: 20
Total Downloads: 18
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 04 Jan 2025

Cited By

Zhu C, Wang F, Hou B (2019). BPP. Proceedings of the 48th International Conference on Parallel Processing, pp. 1-10. Online publication date: 5-Aug-2019.
https://dl.acm.org/doi/10.1145/3337821.3337904
Yoon S, Youn Y, Burgstaller B, Kim S (2019). Self-learnable Cluster-based Prefetching Method for DRAM-Flash Hybrid Main Memory Architecture. ACM Journal on Emerging Technologies in Computing Systems 15:1, pp. 1-21. Online publication date: 9-Jan-2019.
https://dl.acm.org/doi/10.1145/3284932
Yadgar G, Shor R (2017). Experience from Two Years of Visualizing Flash with SSDPlayer. ACM Transactions on Storage 13:4, pp. 1-24. Online publication date: 17-Nov-2017.
https://dl.acm.org/doi/10.1145/3149356
Pu Q, Li H, Zaharia M, Ghodsi A, Stoica I, Argyraki K, Isaacs R (2016). FairRide. Proceedings of the 13th Usenix Conference on Networked Systems Design and Implementation, pp. 393-406. Online publication date: 16-Mar-2016.
https://dl.acm.org/doi/10.5555/2930611.2930637
Jiang S, Ding X, Xu Y, Davis K (2013). A Prefetching Scheme Exploiting both Data Layout and Access History on Disk. ACM Transactions on Storage 9:3, pp. 1-23. Online publication date: 1-Aug-2013.
https://dl.acm.org/doi/10.1145/2508010
Ding W, Zhang Y, Kandemir M, Son S, Hollingsworth J (2012). Compiler-directed file layout optimization for hierarchical storage systems. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-11. Online publication date: 10-Nov-2012.
https://dl.acm.org/doi/10.5555/2388996.2389052
Kandemir M, Yemliha T, Prabhakar R, Jung M (2012). On Urgency of I/O Operations. Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), pp. 188-195. Online publication date: 13-May-2012.
https://dl.acm.org/doi/10.1109/CCGrid.2012.40
Yadgar G, Factor M, Li K, Schuster A (2011). Management of Multilevel, Multiclient Cache Hierarchies with Application Hints. ACM Transactions on Computer Systems 29:2, pp. 1-51. Online publication date: 1-May-2011.
https://dl.acm.org/doi/10.1145/1963559.1963561
Chen W, Chen H, Huang W, Chen X, Huang D, Hariri S, Keahey K (2010). Improving host swapping using adaptive prefetching and paging notifier. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 300-303. Online publication date: 21-Jun-2010.
https://dl.acm.org/doi/10.1145/1851476.1851515
Patrick C, Kandemir M, Karaköy M, Son S, Choudhary A, Hariri S, Keahey K (2010). Cashing in on hints for better prefetching and caching in PVFS and MPI-IO. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 191-202. Online publication date: 21-Jun-2010.
https://dl.acm.org/doi/10.1145/1851476.1851499
