On 02/12/2011 05:51 AM, Kit Gerrits wrote:
>
> Digimer,
>
> Did you ever get a reply from anyone?
>
> If what you say is true, failure of one of our HSRP(HA) switches/routers
> might break the cluster.
> (if they don't share multicast memberships)
>
> I would guess that multicast groups originate in the cluster, not the
> switch.
> In that case, if the switch has been rebooted, the cluster needs to
> re-create the multicast groups on the switch.
>
> I would guess that the cluster itself needs to check if the switch is
> properly handling multicast.
> (subscribe to its own group and check if the packets are being handled
> correctly)
>
> This should provide an insight into clustering/multicast:
> http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a008059a9df.shtml
>
>
> Regards,
>
> Kit

Hi Kit,

I did not, and thank you for replying.

So the cause of the frequent multicast breakdowns, given that it's fairly
rare for switches to reset, probably lies in the periodic checks done by
the switches. I wonder then if corosync, for whatever reason, doesn't
answer those requests, or isn't able to answer them quickly enough.
Perhaps the process takes too much time? Corosync will, by default,
declare a ring dead after ~3s.

More to think about, and I appreciate that link. Thanks. :)

--
Digimer
E-Mail: digimer@xxxxxxxxxxx
AN!Whitepapers: http://alteeve.com
Node Assassin: http://nodeassassin.org

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
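
For reference, the self-check Kit suggests (subscribe to a group, send to it, and see whether the packet comes back) could be sketched roughly as below. This is only an illustration and not how corosync itself works: the function name, the group address 239.192.0.1, and port 50405 are all made-up values for the example. Note also that when sender and receiver are on the same node, the kernel loops the packet back locally, so this only proves the local stack is fine. To exercise the switch, the receiving half would have to run on a second node.

```python
import socket
import struct


def multicast_self_test(group="239.192.0.1", port=50405,
                        iface="0.0.0.0", timeout=1.0):
    """Join `group`, send one probe to it, and report whether it arrived.

    All defaults are hypothetical example values. `iface` is the local
    IPv4 address of the interface to use ("0.0.0.0" lets the kernel pick).
    Returns True if the probe was received, False on timeout or any
    socket error.
    """
    payload = b"mcast-self-test"
    recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Receiver: bind to the port and ask the kernel to join the group.
        recv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        recv.bind(("", port))
        mreq = struct.pack("4s4s",
                           socket.inet_aton(group),
                           socket.inet_aton(iface))
        recv.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        recv.settimeout(timeout)

        # Sender: keep the probe on the local segment (TTL 1) and make
        # sure the sending host also gets a copy (multicast loopback).
        send.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        send.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
        send.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(iface))
        send.sendto(payload, (group, port))

        data, _addr = recv.recvfrom(1024)
        return data == payload
    except OSError:
        # Covers the receive timeout as well as "network unreachable" etc.
        return False
    finally:
        send.close()
        recv.close()


if __name__ == "__main__":
    ok = multicast_self_test()
    print("multicast probe received" if ok else "multicast probe lost")
```

Run periodically, a check like this would at least show whether probes start going missing at some interval after the group is joined, which is the pattern the switch's periodic checks would produce.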