[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

A Unified Framework for Simulating Markovian Models of Highly Dependable Systems

Published: 01 January 1992 Publication History

Abstract

The authors present a unified framework for simulating Markovian models of highly dependable systems. It is shown that a variance reduction technique called importance sampling can be used to speed up the simulation by many orders of magnitude over standard simulation. This technique can be combined very effectively with regenerative simulation to estimate measures such as steady-state availability and mean time to failure. Moveover, it can be combined with conditional Monte Carlo methods to quickly estimate transient measures such as reliability, expected interval availability, and the distribution of interval availability. The authors show the effectiveness of these methods by using them to simulate large dependability models. They discuss how these methods can be implemented in a software package to compute both transient and steady-state measures simultaneously from the same sample run.

References

[1]
{1} A. Bobbio and K. S. Trivedi, "An aggregation technique for the transient analysis of stiff Markov chains," IEEE Trans. Comput., vol. C-35, pp. 803-814, 1986.
[2]
{2} W. G. Bouricius, W. C. Carter, and P. R. Schneider, "Reliability modeling techniques for self-repairing computer systems," in Proc. ACM Nat. Conf., San Francisco, CA, 1969, pp. 295-309.
[3]
{3} K. L. Chung, Markov Chains With Stationary Transition Probabilities, Second Edition. New York: Springer-Verlag, 1967.
[4]
{4} A. E. Conway and A. Goyal, "Monte Carlo simulation of computer system availability/reliability models," in Proc. Seventeenth Symp. Fault-Tolerant Comput., Pittsburgh, PA, 1987, pp. 230-235.
[5]
{5} A. Costes, J. E. Doucet, C. Landrault, and J. C. Laprie, "SURF--A program for dependability evaluation of complex fault-tolerant computing systems," in Proc. Eleventh Symp. Fault-Tolerant Comput., Portland, ME, 1981, pp. 72-78.
[6]
{6} M. Cottrell, J. C. Fort, and G. Malgouyres, "Large deviations and rare events in the study of stochastic algorithms," IEEE Trans. Automat. Contr., vol. AC-28, pp. 907-920, 1983.
[7]
{7} H. Cramér, Mathematical Methods of Statistics. Princeton, NJ: Princeton University Press, 1946.
[8]
{8} M. A. Crane and D. L. Iglehart, "Simulating stable stochastic systems, III: Regenerative processes and discrete event simulations," Oper. Res., vol. 23, no. 1, pp. 33-45, 1975.
[9]
{9} J. B. Dugan, K. S. Trivedi, M. K. Smotherman, and R. M. Geist, "The hybrid automated reliability predictor," J. Guidance, Contr., Dynamics vol. 9, no. 3, pp. 319-331, 1986.
[10]
{10} B. L. Fox and P. W. Glynn, "Discrete-time conversion for simulating semi-Markov processes," Oper. Res. Lett., vol. 5, pp. 191-196, 1986.
[11]
{11} B. L. Fox and P. W. Glynn, "Discrete-time conversion for finite-horizon Markov processes," SIAM J. Appl. Math., vol. 5, no. 5, pp. 1457-1473, 1990.
[12]
{12} R. M. Geist and K. S. Trivedi, "Ultra-high reliability prediction for fault-tolerant computer systems," IEEE Trans. Comput., vol. C-32, pp. 1118-1127, 1983.
[13]
{13} R. M. Geist and M. K. Smotherman, "Ultrahigh reliability estimates through simulation," in Proc. Annu. Reliability and Maintainability Symp., Atlanta, GA, 1989, pp. 350-355.
[14]
{14} P. W. Glynn and P. Heidelberger, "Bias properties of budget constrained simulations," Oper. Res., vol. 38, no. 5, pp. 801-814, 1990.
[15]
{15} P. W. Glynn and D. L. Iglehart, "Importance sampling for stochastic simulations," Management Sci., vol. 35, no. 11, pp. 1367-1392, 1989.
[16]
{16} A. Goyal, W. C. Carter, E. de Souza e Silva, S. S. Lavenberg, and K. S. Trivedi, "The system availability estimator," in Proc. Sixteenth Symp. Fault-Tolerant Comput., Vienna, Austria, 1986, pp. 84-89.
[17]
{17} A. Goyal, S. S. Lavenberg, and K. S. Trivedi, "Probabilistic modeling of computer system availability," Ann. Oper. Res., vol. 8, pp. 285-306, 1987.
[18]
{18} A. Goyal and S. S. Lavenberg, "Modeling and analysis of computer system availability," IBM J. Res. Develop., vol. 31, pp. 651-664, 1987.
[19]
{19} A. Goyal, P. Heidelberger, and P. Shahabuddin, "Measure specific dynamic importance sampling for availability simulations," in 1987 Winter Simulation Conf. Proc., A. Thesen, H. Grant, and W. D. Kelton, Eds., IEEE Press, 1987, pp. 351-357.
[20]
{20} D. Gross and D. R. Miller, "The randomization technique as a modeling tool and solution procedure for transient Markov processes," Oper. Res., vol. 32, pp. 343-361, 1984.
[21]
{21} J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods. London, England: Methuen, 1964.
[22]
{22} A. Hordijk, D. L. Iglehart, and R. Schassberger, "Discrete time methods for simulating continuous time Markov chains," Adv. Appl. Prob., vol. 8, pp. 772-788, 1976.
[23]
{23} H. Kahn and A. W. Marshall, "Methods of reducing sample size in Monte Carlo computations," J. Oper. Res. Soc. Amer., vol. 1, no. 5, pp. 263-278, 1953.
[24]
{24} S. Karlin and H. M. Taylor, A First Course in Stochastic Processes, Second Edition. New York: Academic, 1975.
[25]
{25} D. E. Knuth, "Big omicron, big omega and big theta," SIGACT News (ACM), vol. 8, no. 2, pp. 18-24, 1976.
[26]
{26} S. S. Lavenberg, T. L. Moeller, and C. H. Sauer, "Concomitant control variables applied to the regenerative simulation of queueing systems," Oper. Res. vol. 27, no. 1, pp. 134-160, 1979.
[27]
{27} P. L'Ecuyer, "Efficient and portable combined random number generators," Commun. ACM, vol. 31, no. 6, pp. 742-774, 1988.
[28]
{28} E. L. Lehmann, Theory of Point Estimation. New York: Wiley, 1983.
[29]
{29} E. E. Lewis and F. Böhm, "Monte Carlo simulation of Markov unreliability models," Nuclear Eng. and Design, vol. 77, pp. 49-62, 1984.
[30]
{30} C. A. Liceaga and D. P. Siewiorek, "Toward automatic Markov reliability modelling of computer architectures," NASA Tech. Memo. 89009, 1986.
[31]
{31} S. V. Makam, A. Avizienis, and G. Grusas, UCLA ARIES 82 User's Guide. Also Tech. Rep. CSD-820830, Univ. of California at Los Angeles, 1982.
[32]
{32} A. M. Johnson, Jr. and M. Malek, "Survey of software tools for evaluating reliability, availability, and serviceability," ACM Comput. Surveys, vol. 20, no. 4, pp. 227-269, 1988.
[33]
{33} J. F. Meyer, "On evaluating the performability of degradable computing systems," IEEE Trans. Comput., vol. C-29, no. 8, pp. 720-731, 1980.
[34]
{34} R. G. Miller, "The Jackknife--A review," Biometrika, vol. 61, pp. 1-15, 1974.
[35]
{35} R. R. Muntz, E. de Souza e Silva, and A. Goyal, "Bounding availability of repairable computer systems," IEEE Trans. Comput., vol. 38, no. 12, pp. 1714-1723, 1989.
[36]
{36} V. F. Nicola, "Lumping in Markov reward processes," IBM Res. Rep. RC 14719, Yorktown Heights, NY, 1989.
[37]
{37} V. F. Nicola, M. K. Nakayama, P. Heidelberger, and A. Goyal, "Fast simulation of dependability models with general failure, repair and maintenance processes," in Proc. Twentieth Symp. Fault-Tolerant Comput., Newcastle upon Tyne, England, 1990, pp. 491-498.
[38]
{38} S. M. Ross, Applied Probability Models with Optimization Applications. San Francisco, CA: Holden-Day, 1970.
[39]
{39} S. M. Ross and Z. Schechner, "Using simulation to estimate first passage distribution," Management Sci., vol. 31, no. 2, pp. 224-234, 1985.
[40]
{40} S. Parekh and J. Walrand, "A quick simulation method for excessive backlogs in networks of queues," IEEE Trans. Automat. Contr., vol. 34, no. 1, pp. 54-66, 1989.
[41]
{41} R. A. Sahner and K. S. Trivedi, "Performance and reliability analysis using acyclic directed graphs," IEEE Trans. Software Eng., vol. SE-13, no. 10, pp. 1105-1114, 1987.
[42]
{42} W. H. Sanders and J. F. Meyer, "METASAN: A performability evaluation tool based on stochastic activity networks," in Proc. 1986 Fall Joint Comput. Conf., AFIPS, NY, 1986, pp. 807-816.
[43]
{43} P. Shahabuddin, V. F. Nicola, P. Heidelberger, A. Goyal, and P. W. Glynn, "Variance reduction in mean time to failure simulations," in 1988 Winter Simulation Conf. Proc., M. A. Abrams, P. L. Haigh, and J. C. Comfort, Eds., IEEE Press, 1988, pp. 491-499.
[44]
{44} P. Shahabuddin, "Simulation and analysis of highly reliable systems," Ph.D. dissertation, Dep. Oper. Res., Stanford Univ., 1990.
[45]
{45} D. Siegmund, "Importance sampling in the Monte Carlo study of sequential tests," Ann. Statistics, vol. 4, pp. 673-684, 1976.
[46]
{46} W. L. Smith, "Regenerative stochastic processes," Proc. Roy. Soc. A., vol. 232, pp. 6-31, 1955.
[47]
{47} J. Walrand, "Quick simulation of rare events in queueing networks," in Proc. Second Int. Workshop Appl. Math. and Performance/Reliability Models of Comput./Commun. Syst., G. Iazeolla, P. J. Courtois, and O. J. Boxma, Eds. Amsterdam, North Holland, 1987, pp. 275-286.

Cited By

View all
  • (2024)Overconservativeness of Variance-Based Efficiency Criteria and Probabilistic Efficiency in Rare-Event SimulationManagement Science10.1287/mnsc.2023.497370:10(6852-6873)Online publication date: 1-Oct-2024
  • (2023)Efficiency of Estimating Functions of Means in Rare-Event ContextsProceedings of the Winter Simulation Conference10.5555/3643142.3643171(351-362)Online publication date: 10-Dec-2023
  • (2022)Density Estimators of the Cumulative Reward Up to a Hitting Time to a Rarely Visited Set of a Regenerative SystemProceedings of the Winter Simulation Conference10.5555/3586210.3586221(121-132)Online publication date: 11-Dec-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE Transactions on Computers
IEEE Transactions on Computers  Volume 41, Issue 1
January 1992
129 pages
ISSN:0018-9340
Issue’s Table of Contents

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 January 1992

Author Tags

  1. Markov processes
  2. Monte Carlo methods
  3. Monte Carlo methods.
  4. digital simulation
  5. highly dependable systems
  6. importance sampling
  7. mean time to failure
  8. regenerative simulation
  9. reliability
  10. simulating Markovian models
  11. software package
  12. steady-state availability
  13. steady-state measures
  14. transient measures
  15. unified framework
  16. variance reduction technique

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 12 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Overconservativeness of Variance-Based Efficiency Criteria and Probabilistic Efficiency in Rare-Event SimulationManagement Science10.1287/mnsc.2023.497370:10(6852-6873)Online publication date: 1-Oct-2024
  • (2023)Efficiency of Estimating Functions of Means in Rare-Event ContextsProceedings of the Winter Simulation Conference10.5555/3643142.3643171(351-362)Online publication date: 10-Dec-2023
  • (2022)Density Estimators of the Cumulative Reward Up to a Hitting Time to a Rarely Visited Set of a Regenerative SystemProceedings of the Winter Simulation Conference10.5555/3586210.3586221(121-132)Online publication date: 11-Dec-2022
  • (2022)Analysis of non-Markovian repairable fault trees through rare event simulationInternational Journal on Software Tools for Technology Transfer (STTT)10.1007/s10009-022-00675-x24:5(821-841)Online publication date: 1-Oct-2022
  • (2020)Comparing regenerative-simulation-based estimators of the distribution of the hitting time to a rarely visited setProceedings of the Winter Simulation Conference10.5555/3466184.3466230(421-432)Online publication date: 14-Dec-2020
  • (2020)An efficient statistical model checker for nondeterminism and rare eventsInternational Journal on Software Tools for Technology Transfer (STTT)10.1007/s10009-020-00563-222:6(759-780)Online publication date: 1-Dec-2020
  • (2019)Efficient estimation of the mean hitting time to a set of a regenerative systemProceedings of the Winter Simulation Conference10.5555/3400397.3400431(416-427)Online publication date: 8-Dec-2019
  • (2018)Using regenerative simulation to calibrate exponential approximations to risk measures of hitting times to rarely visited setsProceedings of the 2018 Winter Simulation Conference10.5555/3320516.3320735(1802-1813)Online publication date: 9-Dec-2018
  • (2018)Monte Carlo estimation of economic capitalProceedings of the 2018 Winter Simulation Conference10.5555/3320516.3320729(1754-1765)Online publication date: 9-Dec-2018
  • (2017)On the estimation of the mean time to failure by simulationProceedings of the 2017 Winter Simulation Conference10.5555/3242181.3242334(1-12)Online publication date: 3-Dec-2017
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media