On Mon, 2008-02-04 at 08:33 +0100, Alain Moulle wrote:
> Hi
>
> Just for information, I wonder if this behavior is normal:
> I have a two-node cluster with a quorum disk, and
> CS5 is started on both nodes with a service on each one.
> Quorum is working fine when I break the quorum disk format
> (with a mkfs on the device!) so that mkqdisk -L returns
> none.

It will keep *trying* to operate.

> The behavior is: CS5 keeps working fine as if nothing
> had happened. I wonder if it is only due to the heuristics or
> if this behavior is simply the standard behavior of CS5 with
> regard to the quorum disk?

It /should/ throw warnings in the log for all the blocks that are
corrupt (and it will probably annoy you ;) ). After 1 cycle, the
blocks corresponding to active cluster nodes will have correct/current
data on them, and life should continue, but reading the rest of the 16
node blocks should continue throwing warnings:

[1533] warning: Error reading node ID block 3
[1533] warning: Error reading node ID block 4
[1533] warning: Error reading node ID block 5
[1533] warning: Error reading node ID block 6
[1533] warning: Error reading node ID block 7
...
[1533] warning: Error reading node ID block 16

(Granted, I used 'dd if=/dev/zero ...' instead of mkfs.)

Qdiskd will not function if you restart it, however, and nodes will be
unable to find the quorum disk after a reboot. The header of the
quorum disk is not rewritten while qdiskd is running. You'll have to
run mkqdisk to fix it - which should also work (but certainly isn't
recommended!). This produced the following on the non-master node, but
nothing significant on the master node:

[1533] info: Node 1 shutdown
[1533] debug: Making bid for master
[1533] debug: Node 1 is marked master, but is dead.
[1533] debug: Node 1 is marked master, but is dead.
[1533] debug: Node 1 is marked master, but is dead.
[1533] debug: Node 1 is UP
[1533] info: Node 1 is the master

Looking at the code, if a node dies between the time you clobber the
quorum disk and the time qdiskd on that node writes a new block,
qdiskd won't evict that node. Solution: Don't rub salt in cuts.

Also, intentionally corrupting your quorum disk could result in the
following:

https://bugzilla.redhat.com/show_bug.cgi?id=430264

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster