On Thu, 2008-05-15 at 16:21 +0200, Alain Moulle wrote:
> Hi Lon
>
> Thanks again, but that's strange, because in the man page the
> recommended values are:
>     interval="1" tko="10"
> and so we have a result < 21s, which is the default value of the
> heart-beat timer, so not a hair above like you recommended in your
> previous email ...
> Extract of man qdisk:
>
> interval="1"
>     This is the frequency of read/write cycles, in seconds.
>
> tko="10"
>     This is the number of cycles a node must miss in order to be
>     declared dead.
>
> ?
>
> So the better values to match the default heart-beat timeout of 21s
> should be:
>
> interval="2" and tko="11"
>
> right?

Yes, but you don't want to match it. You want qdisk to time out before CMAN, with enough margin that if the qdisk master node dies, there is time to elect a new master *before* CMAN would normally transition.

On RHEL4, the default CMAN timeout is 21 seconds. On RHEL5, it's 5 seconds, which currently must be tweaked using the totem <token ... > parameter.

I intend to make qdiskd automatically detect the CMAN death detection time in the near future and automatically configure itself, because this is something users/administrators just *shouldn't* have to deal with... (Does anyone disagree with that? :) )

Anyway, here's a graphical representation of why qdiskd needs to time out (long) before CMAN:

http://people.redhat.com/lhh/cmanvsqdisk.png

-- Lon
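
For anyone who wants to check the arithmetic in this thread, here is a rough Python sketch. It assumes, as the discussion implies, that the qdisk timeout is approximately interval * tko seconds. The function names and the constant are made up for illustration and are not part of qdiskd or CMAN.

```python
# Hypothetical helpers for comparing qdisk and CMAN timeouts.
# Assumption: qdisk declares a node dead after roughly interval * tko seconds.

RHEL4_CMAN_TIMEOUT = 21  # seconds, the RHEL4 default quoted in this thread


def qdisk_timeout(interval: int, tko: int) -> int:
    """Approximate seconds before qdisk declares a node dead."""
    return interval * tko


def margin(interval: int, tko: int, cman_timeout: int) -> int:
    """Seconds left between the qdisk timeout and the CMAN timeout.

    A positive value means qdisk times out first, leaving that much time
    to elect a new master; zero or negative means CMAN may transition
    before (or at the same time as) qdisk reacts.
    """
    return cman_timeout - qdisk_timeout(interval, tko)


if __name__ == "__main__":
    # The man page values and the "matching" values proposed above.
    for interval, tko in [(1, 10), (2, 11)]:
        t = qdisk_timeout(interval, tko)
        m = margin(interval, tko, RHEL4_CMAN_TIMEOUT)
        verdict = "qdisk first" if m > 0 else "CMAN first or tie"
        print(f"interval={interval} tko={tko}: qdisk {t}s, margin {m}s ({verdict})")
```

With the man page values the margin against the RHEL4 default comes out positive, while interval="2" tko="11" gives a qdisk timeout that is longer than the CMAN timeout, which is the situation to avoid.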