how is failure detection achieved in Corosync?

"Alejandro Z. Tomsic" <aletomsic@xxxxxxxxx> · Wed, 10 Apr 2013 12:36:15 +0200

I would like to know how failure detection is achieved in Corosync, if it is done at all. I am interested in the implementation details, i.e. whether it is done at the physical machine, virtual machine, or application level. Does Corosync use any known failure detection mechanism, e.g. [1][2][3][4], or some other one? Where can I find this information?
Thank you in advance.

Alejandro

[1] M. Bertier, O. Marin, and P. Sens. Implementation and performance evaluation of an adaptable failure detector. In International Conference on Dependable Systems and Networks (DSN), pages 354–363, June 2002.

[2] W. Chen, S. Toueg, and M. K. Aguilera. On the quality of service of failure detectors. IEEE Transactions on Computers, 51(5):561–580, May 2002.

[3] N. Hayashibara, X. Défago, R. Yared, and T. Katayama. The φ accrual failure detector. In IEEE Symposium on Reliable Distributed Systems (SRDS), pages 66–78, Oct. 2004.

[4] J. B. Leners, H. Wu, W.-L. Hung, M. K. Aguilera, and M. Walfish. Detecting failures in distributed systems with the Falcon spy network. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP '11), pages 279–294, 2011.
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss