
Checkpointing protocols in distributed systems with mobile hosts: A performance analysis

  • Workshop on Fault-Tolerant Parallel and Distributed Systems: Dimiter Avresky, Boston University; David B. Kaeli, Northeastern University
  • Conference paper
Parallel and Distributed Processing (IPPS 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1388)


Abstract

Checkpointing distributed applications that involve mobile hosts is important both to reduce the rollback incurred when recovering from a failure and to manage voluntary disconnections. In this paper we identify the basic characteristics a checkpointing protocol needs in order to work with mobile hosts, namely a reduced number of checkpoints, the use of incremental checkpointing, and consistent global checkpoints built on the fly. These features must be implemented with as little control information as possible while keeping rollback small. We present a comparative performance analysis of some interesting communication-induced checkpointing protocols adapted to a mobile setting. The analysis was carried out using discrete event simulation, and several models of host mobility were considered.
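To make "communication-induced checkpointing" and "consistent global checkpoint built on the fly" concrete, here is a minimal sketch of one well-known index-based scheme. It is an illustration under stated assumptions, not necessarily one of the protocols evaluated in the paper. The names Message, Process, basic_checkpoint, send and receive are hypothetical. Each process numbers its checkpoints and piggybacks the current index on every message; a receiver that sees a larger index takes a forced checkpoint before delivering the message. The only control information is a single integer per message.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    payload: str
    sn: int  # sender's checkpoint index, piggybacked as control information


@dataclass
class Process:
    name: str
    sn: int = 0  # index of the latest local checkpoint
    state: list = field(default_factory=list)        # toy application state
    checkpoints: list = field(default_factory=list)  # (index, kind, state copy)

    def __post_init__(self):
        self._save("initial")

    def _save(self, kind):
        # Store a copy so later state changes do not alter the checkpoint.
        self.checkpoints.append((self.sn, kind, list(self.state)))

    def basic_checkpoint(self):
        """Checkpoint taken on the process's own initiative (e.g. a timer)."""
        self.sn += 1
        self._save("basic")

    def send(self, payload):
        return Message(payload, self.sn)

    def receive(self, msg):
        """Take a forced checkpoint before delivery if the sender is ahead."""
        if msg.sn > self.sn:
            self.sn = msg.sn
            self._save("forced")
        self.state.append(msg.payload)


p, q = Process("p"), Process("q")
p.basic_checkpoint()          # p's index becomes 1
q.receive(p.send("hello"))    # q is behind, so it checkpoints before delivery
q.receive(p.send("again"))    # indices already match: no new checkpoint
print([(index, kind) for index, kind, _ in q.checkpoints])
# prints [(0, 'initial'), (1, 'forced')]
```

In this kind of scheme, checkpoints that carry the same index (or, for a process that skipped that index, its first checkpoint with a larger index) form a consistent global checkpoint, so no extra coordination messages are needed. Note that the forced checkpoint is saved before the message is delivered, so it does not record the receipt of a message sent after the sender's checkpoint with that index.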




Editor information

José Rolim


Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Quaglia, F., Ciciani, B., Baldoni, R. (1998). Checkpointing protocols in distributed systems with mobile hosts: A performance analysis. In: Rolim, J. (eds) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64359-1_739


  • DOI: https://doi.org/10.1007/3-540-64359-1_739


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64359-3

  • Online ISBN: 978-3-540-69756-5

  • eBook Packages: Springer Book Archive
