DOI: 10.1145/1254882.1254887

Modeling the relative fitness of storage

Published: 12 June 2007

Abstract

Relative fitness is a new black-box approach to modeling the performance of storage devices. In contrast with an absolute model that predicts the performance of a workload on a given storage device, a relative fitness model predicts performance differences between a pair of devices. There are two primary advantages to this approach. First, because a relative fitness model is constructed for a device pair, the application-device feedback of a closed workload can be captured (e.g., how the I/O arrival rate changes as the workload moves from device A to device B). Second, a relative fitness model allows performance and resource utilization to be used in place of workload characteristics. This is beneficial when workload characteristics are difficult to obtain or concisely express (e.g., rather than describe the spatio-temporal characteristics of a workload, one could use the observed cache behavior of device A to help predict the performance of device B).
This paper describes the steps necessary to build a relative fitness model, with an approach that is general enough to be used with any black-box modeling technique. We compare relative fitness models and absolute models across a variety of workloads and storage devices. On average, relative fitness models predict bandwidth and throughput within 10-20% and can reduce prediction error by as much as a factor of two when compared to absolute models.
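
To make the distinction between the two kinds of model concrete, the following is a minimal Python sketch, not the paper's implementation. It rests on several assumptions that do not come from the abstract: the measurements are synthetic and invented for illustration, the "performance difference" is expressed as the ratio of device B's bandwidth to device A's, and a one-variable least-squares line stands in for whatever black-box learner (such as CART) would be used in practice. All function and variable names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_relative_error(predicted, actual):
    """Average of |predicted - actual| / actual over all samples."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


def make_samples(n):
    """Synthetic (made-up) measurements, not data from the paper.

    Each tuple is (request_size_kb, hit_rate_on_a, mbps_on_a, mbps_on_b).
    """
    samples = []
    for i in range(n):
        size_kb = 4 + 4 * (i % 8)              # described workload characteristic
        hit_a = ((i * 7) % 10) / 10            # cache hit rate observed on device A
        mbps_a = 20 + 60 * hit_a + 0.5 * size_kb
        mbps_b = mbps_a * (0.6 + 0.8 * hit_a)  # assumed: B benefits more from locality
        samples.append((size_kb, hit_a, mbps_a, mbps_b))
    return samples


def compare_models(samples):
    """Return (absolute_model_error, relative_model_error) on held-out samples."""
    train, test = samples[::2], samples[1::2]
    actual_b = [s[3] for s in test]

    # Absolute model: workload characteristic -> bandwidth on device B.
    slope, icpt = fit_line([s[0] for s in train], [s[3] for s in train])
    abs_pred = [slope * s[0] + icpt for s in test]

    # Relative model: behavior observed on device A -> ratio of B to A.
    # The prediction for B is the predicted ratio times the bandwidth seen on A.
    slope, icpt = fit_line([s[1] for s in train], [s[3] / s[2] for s in train])
    rel_pred = [(slope * s[1] + icpt) * s[2] for s in test]

    return (mean_relative_error(abs_pred, actual_b),
            mean_relative_error(rel_pred, actual_b))


abs_err, rel_err = compare_models(make_samples(40))
print(f"absolute model mean relative error: {abs_err:.3f}")
print(f"relative model mean relative error: {rel_err:.3f}")
```

In this synthetic data the B-to-A ratio is exactly linear in the hit rate observed on A, so the relative model wins by construction. The sketch illustrates only how the two models are formulated and how a relative prediction error is computed. It says nothing about the error levels reported in the paper.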

Published In

SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2007
398 pages
ISBN: 9781595936394
DOI: 10.1145/1254882
  • ACM SIGMETRICS Performance Evaluation Review, Volume 35, Issue 1
    SIGMETRICS '07 Conference Proceedings
    June 2007
    382 pages
    ISSN: 0163-5999
    DOI: 10.1145/1269899
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CART
  2. black-box
  3. modeling
  4. storage

Qualifiers

  • Article

Conference

SIGMETRICS '07

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%

Cited By

  • (2020) LAMBDA. ACM Transactions on Embedded Computing Systems 19:4 (1-31). DOI: 10.1145/3390855. Online publication date: 21-Jun-2020
  • (2020) Network-level Design Space Exploration of Resource-constrained Networks-of-Systems. ACM Transactions on Embedded Computing Systems 19:4 (1-26). DOI: 10.1145/3387918. Online publication date: 21-Jun-2020
  • (2018) Pacaca: Mining Object Correlations and Parallelism for Enhancing User Experience with Cloud Storage. 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) (293-305). DOI: 10.1109/MASCOTS.2018.00036. Online publication date: Sep-2018
  • (2016) Predicting SQL Query Execution Time for Large Data Volume. Proceedings of the 20th International Database Engineering & Applications Symposium (378-385). DOI: 10.1145/2938503.2938552. Online publication date: 11-Jul-2016
  • (2016) Internal Parallelism of Flash Memory-Based Solid-State Drives. ACM Transactions on Storage 12:3 (1-39). DOI: 10.1145/2818376. Online publication date: 12-May-2016
  • (2016) Inside-Out: Reliable Performance Prediction for Distributed Storage Systems in the Cloud. 2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) (127-136). DOI: 10.1109/SRDS.2016.025. Online publication date: Sep-2016
  • (2016) Selecting resources for distributed dataflow systems according to runtime targets. 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC) (1-8). DOI: 10.1109/PCCC.2016.7820629. Online publication date: Dec-2016
  • (2016) Filer Response Time Prediction Using Adaptively-Learned Forecasting Models Based on Counter Time Series Data. 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) (13-18). DOI: 10.1109/ICMLA.2016.0012. Online publication date: Dec-2016
  • (2015) Dynamic provisioning of storage workloads. Proceedings of the 29th Usenix Conference on Large Installation System Administration (13-24). DOI: 10.5555/2907890.2907892. Online publication date: 8-Nov-2015
  • (2015) Automatic Cloud I/O Configurator for I/O Intensive Parallel Applications. IEEE Transactions on Parallel and Distributed Systems 26:12 (3275-3288). DOI: 10.1109/TPDS.2014.2378277. Online publication date: 1-Dec-2015
