
The failure detector abstraction

Published: 04 February 2011

Abstract

A failure detector is a fundamental abstraction in distributed computing. This article surveys the abstraction along two dimensions. First, we study failure detectors as building blocks that simplify the design of reliable distributed algorithms. In particular, we illustrate how failure detectors can factor out the timing assumptions needed to detect failures in distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlight some limitations of the failure detector abstraction along each of these dimensions.
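
To make the idea of factoring out timing assumptions concrete, the following is a minimal sketch, not taken from the article, of a heartbeat-based detector that adapts its timeouts. The class name `HeartbeatFailureDetector`, its parameters, and the explicit `now` argument standing in for a real clock are all assumptions made for illustration. The point is that an agreement algorithm built on top would only read the `suspected` set and would never reason about time itself.

```python
class HeartbeatFailureDetector:
    """Timeout-based suspicion list (illustrative names, not the article's)."""

    def __init__(self, peers, initial_timeout=1.0, backoff=2.0):
        self.timeout = {p: initial_timeout for p in peers}
        self.last_seen = {p: 0.0 for p in peers}
        self.suspected = set()
        self.backoff = backoff

    def heartbeat(self, peer, now):
        """Record a heartbeat from `peer` received at time `now`."""
        self.last_seen[peer] = now
        if peer in self.suspected:
            # The suspicion was false: retract it and wait longer next time.
            self.suspected.discard(peer)
            self.timeout[peer] *= self.backoff

    def check(self, now):
        """Suspect every peer whose last heartbeat is older than its timeout."""
        for peer, seen in self.last_seen.items():
            if now - seen > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)


fd = HeartbeatFailureDetector(["p1", "p2"])
fd.heartbeat("p1", now=0.5)
fd.heartbeat("p2", now=0.5)
first = fd.check(now=2.0)     # both peers silent for longer than 1.0
fd.heartbeat("p1", now=2.1)   # p1 was only slow; its timeout grows to 2.0
second = fd.check(now=3.0)    # p1 is within its new timeout, p2 is not
```

In this sketch a process that has crashed is eventually suspected forever, while a slow but correct process has its timeout enlarged each time it is wrongly suspected. Whether such false suspicions eventually stop depends on the synchrony the underlying system actually provides, which is exactly the assumption the detector hides from the algorithm using it.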



Published In

ACM Computing Surveys, Volume 43, Issue 2
January 2011
276 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/1883612
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 February 2011
Accepted: 01 July 2009
Revised: 01 May 2009
Received: 01 March 2007
Published in CSUR Volume 43, Issue 2


Author Tags

  1. Distributed system
  2. agreement problem
  3. atomic commit
  4. consensus
  5. fault tolerance
  6. liveness
  7. message passing
  8. safety
  9. synchrony

Qualifiers

  • Research-article
  • Research
  • Refereed
