Re: how is failure detection achieved in Corosync?


 



Honza,

Thank you for your help.
I have a further question:

On 10/04/2013, at 15:10, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:

> Alejandro Z. Tomsic wrote:
>> I would like to know how failure detection is achieved in Corosync (if at all). I am interested in the implementation details, i.e. whether it is done at the physical, virtual machine, or application level. Does Corosync use any known failure detection mechanisms, e.g. [1][2][3][4], or any other? Where can I find this information?
> 
> Totem is based on a circulating token, and loss of the token (i.e. the token
> was not delivered within a given time) is used as the failure detector (so it
> is a weak detector). Corosync also (optionally) implements heartbeating.
> 
> For more information, take a look at:
> https://github.com/corosync/corosync/wiki/Developers#reference-documentation

Do you think there is a modular way to replace this mechanism with different failure detectors? I am interested in comparing and evaluating different mechanisms.
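To make concrete what I mean by "modular", here is a rough Python sketch of the kind of plug-in boundary I have in mind. Everything in it is hypothetical: the names FailureDetector, TimeoutDetector and SimplePhiDetector are mine and are not Corosync code or API. The first detector mimics the fixed timeout on token arrival described above; the second is a simplified accrual-style detector in the spirit of [3], assuming exponentially distributed inter-arrival times rather than the model used in that paper.

```python
import math
from typing import Optional, Protocol


class FailureDetector(Protocol):
    """Hypothetical plug-in interface; not part of Corosync."""

    def observe(self, now: float) -> None: ...
    def suspects(self, now: float) -> bool: ...


class TimeoutDetector:
    """Suspects a failure once nothing has arrived for `timeout` seconds."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last: Optional[float] = None

    def observe(self, now: float) -> None:
        # Record the arrival time of a token (or heartbeat).
        self.last = now

    def suspects(self, now: float) -> bool:
        return self.last is not None and now - self.last > self.timeout


class SimplePhiDetector:
    """Simplified accrual detector (assumption: exponential inter-arrivals)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.last: Optional[float] = None
        self.count = 0
        self.mean = 0.0

    def observe(self, now: float) -> None:
        if self.last is not None:
            # Incrementally update the mean inter-arrival time.
            self.count += 1
            self.mean += ((now - self.last) - self.mean) / self.count
        self.last = now

    def phi(self, now: float) -> float:
        if self.last is None or self.count == 0 or self.mean <= 0:
            return 0.0
        # phi = -log10(P(next arrival later than now)), with P = exp(-t / mean).
        return (now - self.last) / (self.mean * math.log(10))

    def suspects(self, now: float) -> bool:
        return self.phi(now) > self.threshold


def feed(detector: FailureDetector, arrivals: list) -> None:
    """Replay a trace of arrival times into any detector."""
    for t in arrivals:
        detector.observe(t)
```

With something like this, the same arrival trace could be replayed into each detector and their detection times and false suspicions compared.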


> 
> In particular,
> http://corosync.github.com/corosync/doc/DAAgarwal.thesis.ps.gz should
> give you the information you need.
> 
> Regards,
>  Honza
> 

Best,

Alejandro


>> 
>> Thank you in advance.
>> 
>> Alejandro
>> 
>> 
>> 
>> 
>> [1] M. Bertier, O. Marin, and P. Sens. Implementation and performance evaluation of an adaptable failure detector. In International Conference on Dependable Systems and Networks (DSN), pages 354–363, June 2002.
>> 
>> [2] W. Chen, S. Toueg, and M. K. Aguilera. On the quality of service of failure detectors. IEEE Transactions on Computers, 51(5):561–580, May 2002.
>> 
>> [3] N. Hayashibara, X. Défago, R. Yared, and T. Katayama. The φ accrual failure detector. In IEEE Symposium on Reliable Distributed Systems (SRDS), pages 66–78, Oct. 2004.
>> 
>> [4] Joshua B. Leners, Hao Wu, Wei-Lun Hung, Marcos K. Aguilera, and Michael Walfish. 2011. Detecting failures in distributed systems with the Falcon spy network. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP '11). ACM, New York, NY, USA, 279-294. 
>> 
>> 
>> 
>> _______________________________________________
>> discuss mailing list
>> discuss@xxxxxxxxxxxx
>> http://lists.corosync.org/mailman/listinfo/discuss
> 






