> I need to upgrade a Xen cluster using RH servers from 5.3 to 5.4.
>
> It is a 3 nodes cluster (+qdisk) and 1 luci server for management.
>
> All VM are in a GFS2 FS in /home/domU/FQDN folders.
>
> Each FQDN folder contains :
> - The xen config file
> - the VM FQDN.img file

We run an apparently very similar setup, based on CentOS. One difference may well be that our /etc/xen contains symbolic links to the VM config files; the files themselves are stored in the same location as the *.img files in the GFS2 file system.

> I read about some issues with rgmanager from 5.4 (unable to create VM
> config files that were not in /etc/xen). This bug has been fixed in
> rgmanager-2.0.52-1.el5_4.1
> Can I safely apply the update without modifying any config made with
> 5.3 ? Do I need to tweak the cluster.conf between 5.3 and 5.4 ?

As Paras mentioned before, we added use_virsh="0" to each VM service in cluster.conf, and we also added max_restarts="0" and restart_expire_time="0" (a trimmed example is sketched further below).

> Also,
> Is it better to upgrade the luci server before the nodes ?

We did not do that. It is my understanding that you can do everything by modifying cluster.conf directly on the nodes as well. As a precaution, we always increased the version number manually and saved the cluster.conf on all nodes within a short time interval.

> I am also curious about the nodes, what is the best practice : moving
> VM , removing the node from the cluster, upgrading and then reboot,
> see if everything is fine and go for the next one after that ? Can I
> stay with a mix of 5.3 and 5.4 for several days ?

After a 5.3 update had gone all wrong, we decided to set vm autostart="0" and to start the VMs from the nodes directly. Luci then showed all VM services as disabled (they were running nevertheless).

Our upgrade path went like this:

The separate server running Luci had been upgraded a long time ago (independently).

We upgraded each VM with a simple "yum update" and shut it down properly from an SSH terminal ("poweroff"). Afterwards, the corresponding node was upgraded with "yum update" and rebooted. The VMs on this node were then restarted manually (note that at this point cluster.conf still contained vm autostart="0"). We repeated the procedure with the next node.

At this point, all VM services had been started from the nodes (not through Luci), and Luci still considered them disabled. To re-integrate them: shut the VMs down properly from an SSH terminal; modify each cluster.conf by adding use_virsh="0", max_restarts="0" and restart_expire_time="0"; and change vm autostart="0" to vm autostart="1" (do not forget to increase the version number at the beginning of cluster.conf). After that, Luci listed the corresponding VM services as available again, although they were not running. We then used the Luci command "Enable" for each VM service. Unfortunately, I don't recall whether the VMs were started automatically at this point or whether we had to restart them separately from Luci. In any case, the communication between Luci and the cluster worked again, and even live migration (as ordered through Luci) worked flawlessly.

Our update from 5.3 to 5.4 was done about a week ago, so it seems to me that the present state of CentOS 5.4 is stable and performing well.

Final remarks:

I understood from various sources that it is advisable to have the nodes and VMs running at the same OS level, or at least the same kernel version.

The Luci "Storage" page still fails; it seems to be a timeout problem.
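For reference, here is a trimmed sketch of how one of the vm entries in our cluster.conf might look after the changes described above (use_virsh, max_restarts, restart_expire_time, autostart). The cluster name, the config_version value and the VM name are only placeholder examples, and the node/fence/qdisk sections are left out entirely:

  <cluster name="xencluster" config_version="43">
    ...
    <rm>
      ...
      <vm name="vm1.example.com" autostart="1" use_virsh="0"
          max_restarts="0" restart_expire_time="0"/>
      ...
    </rm>
  </cluster>

The config_version attribute is the "version number at the beginning of cluster.conf" mentioned above; it has to be increased with every edit so that the nodes pick up the new configuration.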
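And as a last note, a rough sketch of the xm commands we fall back to when working on a node directly. The domain name, config file path and target node are placeholder examples (on our setup the config file can also be reached through the symlink in /etc/xen):

  # show the domains running on this node
  xm list

  # start a VM straight from its config file
  xm create /home/domU/vm1.example.com/vm1.example.com.cfg

  # shut a domain down cleanly
  xm shutdown vm1.example.com

  # live-migrate a running domain to another node
  xm migrate --live vm1.example.com node2

Note that live migration via xm assumes that xend relocation is enabled in /etc/xen/xend-config.sxp on the nodes.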
We came to consider Luci a nice tool, but found it very reassuring to know how to perform the corresponding actions directly from a terminal on the nodes as well (i.e. with "xm ..." commands such as those above).

Regards,
Wolf

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster