It seems that we can get into situations where certain spike conditions will cause a node to evict another node based on missed writes to the qdisk. The problem is that during these spikes, application access to the same storage back end does not seem to be impacted. The SAN in this case is a high-end EMC DMX, multipathed, etc. Currently our clusters are set to interval="1" and tko="15", which should allow for at least 15 seconds of missed writes (a very long time for this type of storage).

Looking at ~/cluster/cman/qdisk/main.c, it seems the following takes place in quorum_loop():

1) read everybody else's status (not sure if this includes yourself)
2) check for node transitions (write an eviction notice if the number of missed heartbeats > tko)
3) check the local heuristics (if we do not meet the requirement, remove ourselves from the qdisk partition and possibly reboot)
4) find the master and/or determine a new master, etc.
5) write out our status to the qdisk
6) write out our local status (heuristics)
7) cycle (sleep for the defined interval)

sleep() is measured in seconds, so a complete cycle = interval + the time taken by steps (1) through (6).

Do you think that any delay in steps (1) through (4) could be the problem? From an architectural standpoint, wouldn't it be better to have (6) and (7) as a separate thread or daemon? A kernel thread like cman_hbeat, for example?

Further, in the check_transitions() procedure, case #2, it might be more helpful to clulog what actually caused the eviction to trigger; the current logging is a bit generic.

I've appended a few rough sketches at the end of this mail to illustrate what I mean.

Am I totally off base, or does this seem plausible?

Thanks,
Dan
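
For reference, a quorumd stanza with the interval/tko values quoted above would look something like the following; the label, votes, and heuristic line are placeholders, not our actual settings:

<quorumd interval="1" tko="15" votes="1" label="qdisk">
    <!-- a node should only be evicted after roughly interval * tko,
         i.e. about 15 seconds of missed status writes -->
    <heuristic program="ping -c1 -w1 192.168.1.254" score="1" interval="2"/>
</quorumd>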
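
To make the timing concern concrete, here is a stripped-down model of the loop as I read it. None of these function names are the real ones from main.c; the point is only that the status write in step (5) happens after all the other work, so the heartbeat period other nodes see is interval plus however long steps (1) through (4) take:

#include <unistd.h>
#include <time.h>
#include <stdio.h>

/* Simplified model of the qdiskd main loop as described above.
 * All functions here are stand-ins, not the real main.c code. */

static void read_other_statuses(void)   { /* step 1: read every node's block  */ }
static void check_transitions(void)     { /* step 2: evict if missed > tko    */ }
static void check_heuristics(void)      { /* step 3: local heuristics         */ }
static void find_master(void)           { /* step 4: master election          */ }
static void write_my_status(void)       { /* step 5: the write others watch   */ }
static void write_local_status(void)    { /* step 6: heuristic scores         */ }

int main(void)
{
    const unsigned interval = 1;        /* seconds, as in our config */

    for (;;) {
        time_t start = time(NULL);

        read_other_statuses();          /* (1) */
        check_transitions();            /* (2) */
        check_heuristics();             /* (3) */
        find_master();                  /* (4) */
        write_my_status();              /* (5) */
        write_local_status();           /* (6) */

        /* (7) cycle: any stall in (1) through (4) directly delays the
         * status write in (5), even though we slept only 'interval'. */
        printf("cycle work took %ld s\n", (long)(time(NULL) - start));
        sleep(interval);
    }
    return 0;
}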
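
And a rough sketch of the kind of split I am suggesting: keep the periodic writes in their own thread so that slow reads or transition checks in the scan loop can never delay them. This is purely illustrative (userspace pthreads, made-up function names), not a patch; whether it would need to be a kernel thread like cman_hbeat to get reliable scheduling under I/O pressure is exactly my question:

#include <pthread.h>
#include <unistd.h>

/* Hypothetical split: the periodic writes run in a dedicated thread,
 * independent of the read/transition/heuristic/master work. */

static void write_my_status(void)     { /* our heartbeat block on the qdisk */ }
static void write_local_status(void)  { /* our heuristic scores             */ }
static void do_scan_work(void)        { /* steps (1) through (4)            */ }

static const unsigned interval = 1;   /* seconds */

static void *writer_thread(void *arg)
{
    (void)arg;
    for (;;) {
        write_my_status();
        write_local_status();
        sleep(interval);              /* fixed cadence, independent of scan time */
    }
    return NULL;
}

int main(void)
{
    pthread_t writer;
    pthread_create(&writer, NULL, writer_thread, NULL);

    for (;;) {
        do_scan_work();               /* slow reads here no longer stall the writes */
        sleep(interval);
    }
    return 0;
}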
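
Finally, on the logging point: something along these lines in the eviction branch of check_transitions() would make it much easier to see after the fact why a node was dropped. The struct, field names, and the clulog.h include path are invented from memory; only the idea of logging the missed-update count and the interval/tko it was judged against matters:

#include <syslog.h>
#include "clulog.h"               /* qdiskd's logging wrapper, path assumed */

/* Invented stand-in for whatever per-node view check_transitions() keeps. */
struct node_view {
    int  node_id;
    int  missed;                  /* consecutive missed status updates */
    long secs_since_update;       /* age of the last good update       */
};

static void log_eviction(const struct node_view *n, int interval, int tko)
{
    clulog(LOG_WARNING,
           "Evicting node %d: %d missed updates, last update %ld s ago "
           "(interval=%d, tko=%d)\n",
           n->node_id, n->missed, n->secs_since_update, interval, tko);
}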