On 10/13/06, Matteo Catanese <m.catanese@xxxxxxxxxxxxx> wrote:
Hi all,

I had a perfectly working 2-node cluster. I saw kernel security updates and a cluster bugfix update, so I waited 2 weeks and decided, today, to do the updates.

I disabled my cluster service (oracle), patched both machines and rebooted.

After reboot I had:

[root@lvzbe1 kernel]# clustat
Could not connect to cluster service

and a bunch of:

Oct 13 13:51:55 lvzbe2 ccsd[3381]: Unable to connect to cluster infrastructure after 3840 seconds.
Oct 13 13:52:26 lvzbe2 ccsd[3381]: Unable to connect to cluster infrastructure after 3870 seconds.
Oct 13 13:52:56 lvzbe2 ccsd[3381]: Unable to connect to cluster infrastructure after 3900 seconds.
Oct 13 13:53:26 lvzbe2 ccsd[3381]: Unable to connect to cluster infrastructure after 3930 seconds.

Cluster DIED.

I investigated and discovered that someone _forgot_ to compile dlm-smp and cman-smp for the latest Red Hat kernel.

This is the "old" kernel:

[root@lvzbe1 kernel]# cd /lib/modules/2.6.9-42.0.2.ELsmp/kernel/
[root@lvzbe1 kernel]# ls -la
total 44
drwxr-xr-x  10 root root 4096 Sep  4 10:17 .
drwxr-xr-x   3 root root 4096 Oct 13 12:56 ..
drwxr-xr-x   3 root root 4096 Sep  4 10:17 arch
drwxr-xr-x   2 root root 4096 Oct 13 12:56 cluster
drwxr-xr-x   2 root root 4096 Sep  4 10:17 crypto
drwxr-xr-x  29 root root 4096 Sep  4 10:17 drivers
drwxr-xr-x  22 root root 4096 Sep  4 10:17 fs
drwxr-xr-x   3 root root 4096 Sep  4 10:17 lib
drwxr-xr-x  13 root root 4096 Sep  4 10:17 net
drwxr-xr-x  10 root root 4096 Sep  4 10:17 sound
[root@lvzbe1 kernel]#

and this is the "new" one:

[root@lvzbe1 kernel]# cd /lib/modules/2.6.9-42.0.3.ELsmp/kernel/
[root@lvzbe1 kernel]# ls -la
total 36
drwxr-xr-x   9 root root 4096 Oct 13 12:20 .
drwxr-xr-x   3 root root 4096 Oct 13 12:31 ..
drwxr-xr-x   3 root root 4096 Oct 13 12:20 arch
drwxr-xr-x   2 root root 4096 Oct 13 12:20 crypto
drwxr-xr-x  29 root root 4096 Oct 13 12:20 drivers
drwxr-xr-x  22 root root 4096 Oct 13 12:20 fs
drwxr-xr-x   3 root root 4096 Oct 13 12:20 lib
drwxr-xr-x  13 root root 4096 Oct 13 12:20 net
drwxr-xr-x  10 root root 4096 Oct 13 12:20 sound
[root@lvzbe1 kernel]#

As you can see, the latest kernel does not have the "cluster" directory.

This is the latest cman:

[root@lvzbe1 kernel]# rpm -qil cman-kernel-smp-2.6.9-45.5
Name        : cman-kernel-smp              Relocations: (not relocatable)
Version     : 2.6.9                        Vendor: Red Hat, Inc.
Release     : 45.5                         Build Date: Fri 18 Aug 2006 07:05:34 PM CEST
Install Date: Fri 13 Oct 2006 12:56:36 PM CEST   Build Host: hs20-bc1-3.build.redhat.com
Group       : System Environment/Kernel    Source RPM: cman-kernel-2.6.9-45.5.src.rpm
Size        : 340198                       License: GPL
Signature   : DSA/SHA1, Tue 22 Aug 2006 09:51:57 PM CEST, Key ID 219180cddb42a60e
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Summary     : cman-kernel-smp - The Cluster Manager kernel smp modules
Description :
cman-kernel-smp - The Cluster Manager kernel smp modules
/lib/modules/2.6.9-42.0.2.ELsmp/kernel/cluster
/lib/modules/2.6.9-42.0.2.ELsmp/kernel/cluster/cman.ko
/lib/modules/2.6.9-42.0.2.ELsmp/kernel/cluster/cman.symvers
[root@lvzbe1 kernel]#

and this is the latest dlm:

rpm -qil dlm-kernel-smp-2.6.9-44.2
Name        : dlm-kernel-smp               Relocations: (not relocatable)
Version     : 2.6.9                        Vendor: Red Hat, Inc.
Release     : 44.2                         Build Date: Tue 26 Sep 2006 10:49:24 PM CEST
Install Date: Fri 13 Oct 2006 12:20:35 PM CEST   Build Host: hs20-bc2-3.build.redhat.com
Group       : System Environment/Kernel    Source RPM: dlm-kernel-2.6.9-44.2.src.rpm
Size        : 329858                       License: GPL
Signature   : DSA/SHA1, Thu 28 Sep 2006 09:44:31 PM CEST, Key ID 219180cddb42a60e
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Summary     : dlm-kernel-smp - The Distributed Lock Manager kernel modules.
Description :
dlm-kernel-smp - The Distributed Lock Manager kernel-smp modules.
/lib/modules/2.6.9-42.0.2.ELsmp/kernel/cluster/dlm.ko
/lib/modules/2.6.9-42.0.2.ELsmp/kernel/cluster/dlm.symvers

Luckily this is not (yet) a production system, and I REALLY hope I did something wrong, even if I'm sure I did not.

Can I download cman-kernel-src.rpm and dlm-kernel.src.rpm and compile them myself, while waiting for answers from you?

Matteo

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
The cluster packages are kernel-specific and lag behind normal kernel updates. I'm not sure whether Red Hat releases cluster updates outside the regular update cycle, though; I have only been using them for two updates so far.
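
To check whether the cluster modules exist for a given kernel before rebooting into it, something along these lines might help. It is only a rough sketch: it assumes the modules live under /lib/modules/<release>/kernel/cluster as cman.ko and dlm.ko, as in the listings above. The function name check_cluster_modules and the MODULES_ROOT override are made up for illustration and are not part of any Red Hat tool.

```shell
#!/bin/sh
# Report whether cman.ko and dlm.ko exist for one kernel release.
# Assumption: modules sit in <root>/<release>/kernel/cluster/, matching
# the paths shown by "rpm -qil" in the original message.
# MODULES_ROOT is a hypothetical override (defaults to /lib/modules).
check_cluster_modules() {
    kver=$1
    root=${MODULES_ROOT:-/lib/modules}
    missing=0
    for mod in cman dlm; do
        if [ -f "$root/$kver/kernel/cluster/$mod.ko" ]; then
            echo "$kver: $mod.ko present"
        else
            echo "$kver: $mod.ko MISSING"
            missing=1
        fi
    done
    # Exit status 0 only if both modules were found.
    return "$missing"
}
```

Calling check_cluster_modules 2.6.9-42.0.3.ELsmp on the machine described above would be expected to report both modules as missing and return non-zero, since that kernel has no "cluster" directory. Running it for each installed release before changing the default boot kernel is one way to avoid rebooting into a kernel the cluster cannot use.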