[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article
Free access

Queueing analysis of fault-tolerant computer systems (extended abstract)

Published: 01 May 1986 Publication History

Abstract

Queueing models provide a useful tool for predicting the performance of many service systems including computer systems, telecommunication systems, computer/communication networks and flexible manufacturing systems. Traditional queueing models predict system performance under the assumption that all service facilities provide failure-free service. It must, however, be acknowledged that service facilities do experience failures and that they get repaired. In recent years, it has been increasingly recognized that this separation of performance and reliability/availability models is no longer adequate.
An exact steady-state queueing analysis of such systems is considered by several authors and is carried out by means of generating functions, supplementary variables, imbedded Markov process and renewal theory, or probabilistic techniques [1,2,7,8]. Another approach is approximate, in which it is assumed that the time to reach the steady-state is much smaller than the times to failures/repairs. Therefore, it is reasonable to associate a performance measure (reward) with each state of the underlying Markov (or semi-Markov) model describing the failure/repair behavior of the system. Each of these performance measures is obtained from the steady-state queueing analysis of the system in the corresponding state [3,5].
Earlier we have developed models to derive the distribution of job completion time in a failure-prone environment [3,4]. In these models, we need to consider a possible loss of work due to the occurrence of a failure, i.e., the interrupted job may be resumed or restarted upon service resumption. Note that the job completion time analysis includes the delays due to failures and repairs. The purpose of this paper [9] is to extend our earlier analysis so as to account for the queueing delays. In effect, we consider an exact queueing analysis of fault-tolerant systems in order to obtain the steady-state distribution and the mean of the number of jobs in the system. In particular, we study a system in which jobs arrive in a Poisson fashion and are serviced according to FCFS discipline. The service requirements of the incoming jobs form a sequence of independent and identically distributed random variables. The failure/repair behaviour of the system is modelled by an irreducible continuous-time Markov chain, which is independent of the number of jobs in the system. Let the state-space be {1,2, …,n}. When the computer system is in state i it delivers service at rate ri ≥ 0. Furthermore, depending on the type of the state, the work done on the job is preserved or lost upon entering that state. The actual time required to complete a job depends in a complex way upon the service requirement of the job and the evolution of the state of the system. Note that even though the service requirements of jobs are independent and identically distributed, the actual times required to complete these jobs are neither independent nor identically distributed, and hence the model cannot be reduced to a standard M/G/1 queue [8]. As loss of work due to failures and interruptions is quite a common phenomenon in fault-tolerant computer systems, the model proposed here is of obvious interest.
Using our earlier results on the distribution of job completion time we set up a queueing model and show that it has the block M/G/1 structure. Queueing models with such a structure have been studied by Neuts, Lucantoni and others [6]. We demonstrate the usefulness of our approach by performing the numerical analysis for a system with two processors subject to failures and repairs.

References

[1]
Baceelli, F. and Trivedi, K. S., "Analysis of M/G/2-Standby Redundant System," Performance '88, A. K. Agrawala and S. k. Tripathi (eds.), North Holland, 1983, pp. 457-476.
[2]
Gaver D. P., "A Waiting Line with Interrupted Service, including Priorities," .7. Royal Statistical Society, Series B-24, 1982, pp. 73-90.
[3]
Kulkarni, V. G., Nicola, V. F., Trivedi, K. S. and Smith, R. M., "A Unified Model for the Analysis of Job Completion Time and Performability Measures in Fault-Tolerant Systems," Tech. Report CS-1985-13, Dept. of Computer Science, Duke University, N. C., 1985.
[4]
Kulkarni, V. G., Nicola, V. F. and Trivedi, K. S., "The Completion Time of a Job on Multi-Mode Systems," in review, jr. of Applied Probability.
[5]
Meyer, J., "Closed-Form Solution of Performability," {FEE Trans. on Computer#, 3/ol. C-31, No. 7, 1982, pp. 648-657.
[6]
Neuts, M. F., 'CMatrix Analytic Methods in Queueing Theory," Eeropean Y. of Operationa Re, earth, Vol. 15, 1984, pp. 2-12.
[7]
Neut, M.F. and Lucantoni, D.M., "A Markovian Queue with N Servers Subject to Breakdowns and Repairs," Manafement Science, Vol. 25, No. 9, 1979, pp. 849-861.
[8]
Nicola, V. F., "A Sing}e Server Queue with Mixed Types of Interruptions," to appear in Aeta Informatiea, 1986.
[9]
Nico}a, V. F., Kulkarni, V. G. and Trivedi, K. S., "Queueing Analysis of Fault-Tolerant Systems," accepted in Performance '86, to appear in IEEE Trans. on Software Engineering, 1986.

Cited By

View all
  • (2006)A simple proof for the matrix‐geometric theoremApplied Stochastic Models and Data Analysis10.1002/asm.31500801058:1(25-29)Online publication date: Sep-2006
  • (2011)Analysis of stochastic reaction networks with Markov reward modelsProceedings of the 9th International Conference on Computational Methods in Systems Biology10.1145/2037509.2037517(45-54)Online publication date: 21-Sep-2011

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGMETRICS Performance Evaluation Review
ACM SIGMETRICS Performance Evaluation Review  Volume 14, Issue 1
May 1986
277 pages
ISSN:0163-5999
DOI:10.1145/317531
Issue’s Table of Contents
  • cover image ACM Conferences
    SIGMETRICS '86/PERFORMANCE '86: Proceedings of the 1986 ACM SIGMETRICS joint international conference on Computer performance modelling, measurement and evaluation
    May 1986
    262 pages
    ISBN:0897911849
    DOI:10.1145/317499
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 1986
Published in SIGMETRICS Volume 14, Issue 1

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)59
  • Downloads (Last 6 weeks)8
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2006)A simple proof for the matrix‐geometric theoremApplied Stochastic Models and Data Analysis10.1002/asm.31500801058:1(25-29)Online publication date: Sep-2006
  • (2011)Analysis of stochastic reaction networks with Markov reward modelsProceedings of the 9th International Conference on Computational Methods in Systems Biology10.1145/2037509.2037517(45-54)Online publication date: 21-Sep-2011

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media