On 02/11/2009 21:29, Gianluca Cecchi wrote:
> On Mon, Nov 2, 2009 at 6:25 PM, David Teigland <teigland@xxxxxxxxxx> wrote:
>
>> The out-of-memory should be fixed in 5.4:
>> https://bugzilla.redhat.com/show_bug.cgi?id=508829
>>
>> The fix for dlm_send spinning is not released yet:
>> https://bugzilla.redhat.com/show_bug.cgi?id=521093
>>
>> Dave
>
> Thank you so much for the feedback.
> So I have to expect this freeze and possible downtime... also on my real nodes.
> Could a safer method be the one below, for my two nodes + quorum disk cluster?
>
> 1) Shut down the passive node and restart it in single user mode.
>    Now the cluster is composed of only one node in 5.3, without loss of
>    service at the moment.
> 2) Start the network and update the passive node (as in the steps of the
>    first mail).
> 3) Reboot the just-updated node in single user mode and test correct
>    functionality (without cluster).
> 4) Shut down the just-updated node again.
> 5) Shut down the active node --- NOW we have downtime (planned).
> 6) Start up the updated node, now in 5.4 (and with bug 508829 corrected).
>    This node should form the cluster with 2 votes, itself and the quorum
>    disk, correct?
> 7) IDEA: make a dummy update to the config on this newly running node, only
>    incrementing the version number by one, so that afterwards, when the
>    other node comes up, it gets the config...
>    Does it make sense, or is there no need/no problem here when the second
>    node joins?
> 8) Power on the node still in 5.3 in single user mode.
> 9) Start the network on it and update the system as in step 2).
> 10) Reboot the just-updated node and let it start in single user mode to
>     test its functionality (without cluster enabled).
> 11) Reboot again and let it join the cluster normally.
>     Expected result: correct join of the cluster, correct?
> 12) Test a relocation of the service --- NOW another little downtime, but
>     to be sure that in case of need we get relocation without problems.
>
> I'm going to test this tomorrow (it is half past ten pm here now) after
> restoring the initial situation with both nodes in 5.3, so if there are any
> comments, they are welcome.
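
The vote question in step 6 and the dummy config update in step 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of what the cluster software does internally. It assumes each node and the quorum disk carry one vote (so 3 expected votes), that quorum is a simple majority of expected votes, and that the config version is the `config_version` attribute on the root `<cluster>` element. The cluster name `example` and the version `7` are made up, and propagating the changed file to the other node is not covered.

```python
import xml.etree.ElementTree as ET


def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum, assuming a simple majority rule."""
    return expected_votes // 2 + 1


def is_quorate(present_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the majority threshold."""
    return present_votes >= quorum_threshold(expected_votes)


def bump_config_version(cluster_conf_xml: str) -> str:
    """Return the same config text with config_version incremented by one."""
    root = ET.fromstring(cluster_conf_xml)
    current = int(root.get("config_version"))
    root.set("config_version", str(current + 1))
    return ET.tostring(root, encoding="unicode")


# Assumption: node A = 1 vote, node B = 1 vote, quorum disk = 1 vote.
EXPECTED_VOTES = 3

# Step 6: one updated node plus the quorum disk gives 2 votes.
print(is_quorate(2, EXPECTED_VOTES))  # True
# A single node without the quorum disk has only 1 vote.
print(is_quorate(1, EXPECTED_VOTES))  # False

# Step 7: dummy update that only increments the version number.
sample = '<cluster name="example" config_version="7"/>'  # hypothetical config
print(bump_config_version(sample))
```

Under these assumptions the single 5.4 node together with the quorum disk holds 2 of 3 votes and is quorate on its own, which matches the expectation in step 6.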
FWIW, I just updated one of my DRBD+GFS clusters from 5.3 (and early 5.3 at that) to 5.4 with a rolling restart, and it "just worked". It's a 2-node cluster with a shared GFS root. I updated it, rebuilt the initrd, and rebooted one node, which came up and rejoined fine; then I rebooted the other. No service downtime.
Gordan

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster