[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.5555/1792734.1792763guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

On verifying fault tolerance of distributed protocols

Published: 29 March 2008 Publication History

Abstract

Distributed systems are composed of processes connected in some network. Distributed systems may suffer from faults: processes may stop, be interrupted, or be maliciously attacked. Fault-tolerant protocols are designed to be resistant to faults. Proving the resistance of protocols to faults is a very challenging problem, as it combines the parameterized setting that distributed systems are based-on, with the need to consider a hostile environment that produces the faults. Considering all the possible fault scenarios for a protocol is very difficult. Thus, reasoning about fault-tolerance protocols utterly needs formal methods.
In this paper we describe a framework for verifying the fault tolerance of (synchronous or asynchronous) distributed protocols. In addition to the description of the protocol and the desired behavior, the user provides the fault type (e.g., failstop, Byzantine) and its distribution (e.g., at most half of the processes are faulty). Our framework is based on augmenting the description of the configurations of the system by a mask describing which processes are faulty. We focus on regular model checking and show how it is possible to compile the input for the model-checking problem to one that takes the faults and their distribution into an account, and perform regular model-checking on the compiled input. We demonstrate the effectiveness of our framework and argue for its generality.

References

[1]
Abdulla, P.A., d'Orso, J., Jonsson, B., Nilsson, M.: Algorithmic improvements in regular model checking. In: Hunt Jr., W.A., Somenzi, F. (eds.) CAV 2003. LNCS, vol. 2725, pp. 236-248. Springer, Heidelberg (2003).
[2]
Abdulla, P.A., Jonsson, B., Nilsson, M., d'Orso, J., Saksena, M.: Regular model checking for LTL(MSO). In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 348-360. Springer, Heidelberg (2004).
[3]
Abdulla, P.A., Jonsson, B., Nilsson, M., d'Orso, J., Saksena, M.: A survey of regular model checking. In: Gardner, P., Yoshida, N. (eds.) CONCUR 2004. LNCS, vol. 3170, pp. 35-48. Springer, Heidelberg (2004).
[4]
Apt, K., Kozen, D.: Limits for automatic verification of finite-state concurrent systems. Information Processing Letters 22(6), 307-309 (1986).
[5]
Arora, A., Gouda, M.G.: Closure and convergence: A foundation of fault-tolerant computing. Software Engineering 19(11), 1015-1027 (1993).
[6]
Attie, P.C., Arora, A., Emerson, E.A.: Synthesis of fault-tolerant concurrent programs. ACM TOPLAS 26, 128-185 (2004).
[7]
Awerbuch, B.: Optimal distributed algorithms for minimum weight spanning tree, counting, leader election and related problems. In: Proc. 19th STOC, pp. 230-240 (1987).
[8]
Baier, C., Bertrand, N., Schnoebelen, P.: On computing fixpoints in well-structured regular model checking, with applications to lossy channel systems. In: Hermann, M., Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 347-361. Springer, Heidelberg (2006).
[9]
Bardin, S., Finkel, A., Leroux, J., Schnoebelen, P.: Flat acceleration in symbolic model checking. In: Peled, D.A., Tsay, Y.-K. (eds.) ATVA 2005. LNCS, vol. 3707, pp. 474-488. Springer, Heidelberg (2005).
[10]
Bouajjani, A., Habermehl, P., Vojnar, T.: Abstract regular model checking. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 372-386. Springer, Heidelberg (2004).
[11]
Büchi, J.R.: Weak second-order arithmetic and finite automata. Zeit. Math. Logik und Grundl. Math. 6, 66-92 (1960).
[12]
Keneddey Space Center. NASA space shuttle launch archive, mission STS-1 (1981) http://science.ksc.nasa.gov/shuttle/missions/sts-1/mission-sts-1.html
[13]
Daliot, A., Dolev, D., Parnas, H.: Linear time byzantine self-stabilizing clock synchronization. In: Proc. of 7th PODC, pp. 7-19 (2003).
[14]
Dolev, D., Strong, H.R.: Authenticated algorithms for byzantine agreement. SIAM Journal on Computing 12, 656-666 (1983).
[15]
Elgaard, J., Klarlund, N., Möller, A.: Mona 1.x: new techniques for WS1S and WS2S. In: Y. Vardi, M. (ed.) CAV 1998. LNCS, vol. 1427, pp. 516-520. Springer, Heidelberg (1998)
[16]
Elgot, C.: Decision problems of finite-automata design and related arithmetics. Trans. Amer. Math. Soc. 98, 21-51 (1961).
[17]
Emerson, E.A., Kahlon, V.: Reducing model checking of the many to the few. In: Proc. 17th CAD, pp. 236-255 (2000).
[18]
Emerson, E.A., Namjoshi, K.S.: On reasoning about rings. IJFCS 14(4), 527-550 (2003).
[19]
Faloutsos, M., Molle, M.: Optimal distributed algorithm for minimum spanning trees revisited. In: Proc. 14th PODC, pp. 231-237 (1995).
[20]
Fang, Y., Piterman, N., Pnueli, A., Zuck, L.: Liveness with invisible ranking. STTT 8(3), 261-279 (2004).
[21]
Fisman, D., Pnueli, A.: Beyond regular model checking. In: Hariharan, R., Mukund, M., Vinay, V. (eds.) FSTTCS 2001. LNCS, vol. 2245, Springer, Heidelberg (2001).
[22]
Habermehl, P., Vojnar, T.: Regular model checking using inference of regular languages. ENTCS 138(3), 21-36 (2005).
[23]
Kesten, Y., Maler, O., Marcus, M., Pnueli, A., Shahar, E.: Symbolic model checking with rich assertional languages. TCS 256, 93-112 (2001).
[24]
Lesens, D., Halbwachs, N., Raymond, P.: Automatic verification of parameterized linear networks of processes. In: Proc. 24th POPL, pp. 346-357 (1997).
[25]
Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1996).
[26]
Malekpour, M.R.: A byzantine fault-tolerant self-stabilization synchronization protocol for distributed clock synchronization systems. TR NASA/TM-2006-214322, NASA STI (2006).
[27]
Malekpour, M.R., Sinimiceanu, R.: Comments on the byzantine self-stabilization synchronization protocol: counterexamples. TR NASA/TM-2006-213951, NASA STI, (2006).
[28]
Pnueli, A., Shahar, E.: Liveness and acceleration in parameterized verification. In: Proc. 12th CAV, pp. 328-343 (2000).
[29]
Schlichting, R.D., Schneider, F.B.: Fail-stop processors: An approach to designing fault-tolerant computing systems. Computer Systems 1(3), 222-238 (1983).
[30]
Tanenbaum, A., van Steen, M.: Distributed Systems: Principles and Paradigms. Prentice Hall, Englewood Cliffs (2007).
[31]
Thomas, W.: Automata on infinite objects. Handbook of Theoretical Computer Science, 133- 191 (1990).
[32]
Vardi, M.Y., Wolper, P.: Reasoning about infinite computations. I&C 115(1), 1-37 (1994).
[33]
Wolper, P., Lovinfosse, V.: Verifying properties of large sets of processes with network invariants. In: Proc. Automatic verification methods for finite state systems, pp. 68-80 (1990).

Cited By

View all
  • (2017)Abstracting the geniuses away from failure testingCommunications of the ACM10.1145/315248361:1(54-61)Online publication date: 27-Dec-2017
  • (2017)A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithmsACM SIGPLAN Notices10.1145/3093333.300986052:1(719-734)Online publication date: 1-Jan-2017
  • (2017)A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithmsProceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages10.1145/3009837.3009860(719-734)Online publication date: 1-Jan-2017
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
TACAS'08/ETAPS'08: Proceedings of the Theory and practice of software, 14th international conference on Tools and algorithms for the construction and analysis of systems
March 2008
518 pages
ISBN:3540787992
  • Editors:
  • C. R. Ramakrishnan,
  • Jakob Rehof

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 29 March 2008

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2017)Abstracting the geniuses away from failure testingCommunications of the ACM10.1145/315248361:1(54-61)Online publication date: 27-Dec-2017
  • (2017)A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithmsACM SIGPLAN Notices10.1145/3093333.300986052:1(719-734)Online publication date: 1-Jan-2017
  • (2017)A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithmsProceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages10.1145/3009837.3009860(719-734)Online publication date: 1-Jan-2017
  • (2016)Automating Failure Testing Research at Internet ScaleProceedings of the Seventh ACM Symposium on Cloud Computing10.1145/2987550.2987555(17-28)Online publication date: 5-Oct-2016
  • (2015)Lineage-driven Fault InjectionProceedings of the 2015 ACM SIGMOD International Conference on Management of Data10.1145/2723372.2723711(331-346)Online publication date: 27-May-2015
  • (2015)Proceedings of the 2015 ACM SIGMOD International Conference on Management of DataundefinedOnline publication date: 27-May-2015
  • (2014)Tutorial on Parameterized Model Checking of Fault-Tolerant Distributed AlgorithmsAdvanced Lectures of the 14th International School on Formal Methods for Executable Software Models - Volume 848310.1007/978-3-319-07317-0_4(122-171)Online publication date: 16-Jun-2014
  • (2013)Brief announcementProceedings of the 2013 ACM symposium on Principles of distributed computing10.1145/2484239.2484285(119-121)Online publication date: 22-Jul-2013
  • (2009)Behavioral automata composition for automatic topology independent verification of parameterized systemsProceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering10.1145/1595696.1595758(325-334)Online publication date: 24-Aug-2009

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media