On 1/9/2012 9:52 AM, Alan Brown wrote:
> On 09/01/12 02:38, Digimer wrote:
>
>> Technically yes, practically no. Or rather, not without a lot of
>> testing first.
>
> This is "rather a shame"
>
> I have a similar requirement (EL5 -> EL6 with GFS)
>

Well, the cluster stack itself (openais/cman/gfs/rgmanager ->
corosync/cman/gfs2/rgmanager) is capable of handling the upgrade in a
compatible mode.

*BUT* (yes, there are tons of those) over time, while performing
different upgrade scenarios/tests, we came to the conclusion that it is
a lot more complicated for any user (even expert/advanced ones) to
perform a safe upgrade than to rebuild the cluster from scratch (*),
given that setup/config/etc are known from the old cluster.

>> There may be some other things you need to do as well. Please be sure
>> to do proper testing and, if you have the budget, hire Red Hat to advise
>> on this process. Also, please report back your results. It would help me
>> help others in the same boat later. :)
>
> RH's advice to use is to "Big Bang" it.

It's not much of an advice, as RH does not officially support this
upgrade method.

>
> The last such transition (EL4 to EL5) was an unmitigated disaster even
> with RH onsite to make the change, so we're _very_ wary this time around.
>

The changes in the cluster software between EL5 and EL6 are a lot less
intrusive at the system level. I can't really say for sure for the
entire OS, since the upgrade doesn't involve only RHCS.

Fabio

(*) The major issues while upgrading from 5 to 6 are:

- GFS1 is not supported in EL6. Volumes need to be migrated to GFS2
  (there are several ways to do it, but it still needs to be done
  offline).
- cluster.conf cannot be updated automatically during an upgrade, or
  while nodes are running in mixed mode (some nodes at 5 and others
  at 6).
- some config options, while backward compat should be retained, need
  to be changed in a very specific sequence, making it really hard to
  perform an easy upgrade.
- but the biggest blocker of all is the resources (driven or not by
  rgmanager). For example, an apache2 config from EL5 cannot be used
  out-of-the-box on EL6. So assuming rgmanager is driving apache2, you
  would need to set up 2 separate apache2 configs, test them
  individually, perform migration checks between EL5 and 6... etc.
  This kind of testing costs more time and complexity than redoing
  the cluster from scratch.

There are also other resources that are simply unable to deal with this
kind of upgrade. Take the example of a DB stored on a gfs2 filesystem.
The DB is created in format version 1. After a migration to EL6, the DB
format is upgraded to internal version 2, which is incompatible with
version 1. If there is a situation where the service needs to fail over
back to a node running EL5, the DB will be unable to start, effectively
killing the purpose of HA.

What you want to notice is that the service compatibility level has
nothing to do with the cluster itself.

Now, when you multiply the number of possible services, failover
scenarios, config changes etc, you will easily come to the conclusion
that an upgrade of this proportion is a path to insanity for the
administrator.
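
To make the DB failback example above concrete, here is a toy model in
Python. It is only a sketch: the node names, the Node/Database classes,
the start_service helper and the format numbers are all hypothetical,
and it assumes the DB upgrades its on-disk format one way on first
start. Nothing here talks to rgmanager or a real DB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    max_db_format: int  # newest on-disk format this node's DB binary can open


@dataclass
class Database:
    format_version: int  # format currently written on the shared filesystem


def start_service(node: Node, db: Database) -> bool:
    """Try to start the DB on `node` (hypothetical helper).

    Assumption: the DB refuses to open a format newer than it knows,
    and silently upgrades an older format to its own, with no way back.
    """
    if db.format_version > node.max_db_format:
        return False  # binary on this node is too old for the data
    db.format_version = node.max_db_format  # one-way format upgrade
    return True


el5 = Node("node-el5", max_db_format=1)
el6 = Node("node-el6", max_db_format=2)
db = Database(format_version=1)

started_on_el5 = start_service(el5, db)   # service runs on the old node
moved_to_el6 = start_service(el6, db)     # relocation upgrades the format
failback_to_el5 = start_service(el5, db)  # failback can no longer start

print(started_on_el5, moved_to_el6, failback_to_el5)  # True True False
```

The cluster layer does its job in all three steps. The failure comes
entirely from the service's own data format, which is the point made
above.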