isplist@xxxxxxxxxxxx wrote:
>>> What should I be looking for to post here?
>>>
>>
>> The exact detail of any kernel panic you are seeing .. ALL the text.
>> and then the obvious stuff: cluster.conf file, version numbers of all
>> cluster software, distribution and where you got them from.
>> Copies of things in /proc/cluster are always helpful too, if you can get
>> them from any running node (please say which node).
>>
>
> That's a lot of info :). Got some of it at least;
>
> ccs 1.0.7-0.XOS.1
> cman 1.0.11-0.XOS.1
> cman-kernel 2.6.9-36.0.XOS.1
> cman-kernel 2.6.9-45.8.XOS.1
> cman-kernel 2.6.9-45.15.XOS.1
> fence 1.32.25-1.XOS.1
> lvm2-cluster 2.02.06-7.0.RHEL4.XOS.1
> magma 1.0.6-0.XOS.1
> magma-plugins 1.0.9-0.XOS.1
> piranha 0.8.2-1.XOS.1
> system-config-cluster 1.0.27-1.0.XOS.1
>
> And yes, I know I'm running old versions but all of the nodes are running the
> same things and it works fine for me, cept for this new problem :). Now, as I
> posted this, it does dawn on me that the new node (img62) would have newer
> versions of all of the above installed. Would this be the cause? Should I
> upgrade all nodes to the latest versions?
>

Yes, it could be that the versions are out of step. I'm not sure what's in
each of those versions as I don't recognise the numbers, but there were some
incompatibilities between very old versions of cman and newer ones.

So I strongly recommend upgrading, or at least using the same version on
all nodes.

> This is the kernel panic from .58 when .62 (img62) tries to join the cluster.
> The new node does have an updated cluster.conf and so do all of the other
> nodes to reflect the new node joining. All nodes had their hosts file updated
> also so that they know about its IP.
>
> Nov 13 09:59:32 compdev kernel: klogd 1.4.1, log source = /proc/kmsg started.
> Nov 13 10:03:40 compdev kernel: CMAN: node img62.domain.com rejoining
> Nov 13 10:03:42 compdev kernel: Unable to handle kernel paging request at virtual address 008c9689
> Nov 13 10:03:42 compdev kernel: printing eip:
> Nov 13 10:03:42 compdev kernel: e09e0d19
> Nov 13 10:03:42 compdev kernel: *pde = 00000000
> Nov 13 10:03:42 compdev kernel: Oops: 0000 [#1]
> Nov 13 10:03:42 compdev kernel: Modules linked in: autofs4 dlm(U) cman(U) md5 ipv6 sunrpc dm_mirror uhci_hcd e100 mii floppy ext3 jbd dm_mod qla2200 qla2xxx scsi_transport_fc sd_mod scsi_mod
> Nov 13 10:03:42 compdev kernel: CPU: 0
> Nov 13 10:03:42 compdev kernel: EIP: 0060:[<e09e0d19>] Not tainted VLI
> Nov 13 10:03:42 compdev kernel: EFLAGS: 00010202 (2.6.9-42.0.10.EL.XOS.1)
> Nov 13 10:03:42 compdev kernel: EIP is at process_join_request+0x65/0x1ba [cman]
> Nov 13 10:03:42 compdev kernel: eax: 00000000 ebx: 008c9689 ecx: e09f20c0 edx: dd439000
> Nov 13 10:03:42 compdev kernel: esi: 00006564 edi: 0000003a ebp: dd439f98 esp: dd439f58
> Nov 13 10:03:42 compdev kernel: ds: 007b es: 007b ss: 0068
> Nov 13 10:03:42 compdev kernel: Process cman_serviced (pid: 2212, threadinfo=dd439000 task=de793340)
> Nov 13 10:03:42 compdev kernel: Stack: 00000000 d6f9c014 0000003e 00000000 00000000 00000000 00000000 00000000
> Nov 13 10:03:42 compdev kernel: 95eb1078 0003641b de750ae0 0000003e d6f9c000 dd439f98 e09de8a3 e09e1125
> Nov 13 10:03:42 compdev kernel: 00000001 00000000 00000000 00070000 61666564 06e57ac4 000000d9 de793340
> Nov 13 10:03:42 compdev kernel: Call Trace:
> Nov 13 10:03:42 compdev kernel: [<e09de8a3>] serviced+0x0/0x140 [cman]
> Nov 13 10:03:42 compdev kernel: [<e09e1125>] process_message+0x32/0x93 [cman]
> Nov 13 10:03:42 compdev kernel: [<e09e12a9>] process_messages+0x123/0x13e
>

Patrick

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
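
One way to act on the "same version on all nodes" advice is to compare the package lists node by node. The following is a minimal Python sketch and not part of the original thread. It assumes each node's list has been saved as "name version" lines in the format posted above (for example from `rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n'`). The function names, the node label "existing-node", and the "HYPOTHETICAL-NEWER" version string are illustrative placeholders, since the actual versions installed on img62 were not posted.

```python
from collections import defaultdict


def parse_package_list(text):
    """Parse 'name version' lines into {name: set of versions}.

    A package such as cman-kernel may be installed in several versions
    at once, so each name maps to a set rather than a single string.
    """
    packages = defaultdict(set)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, version = parts
        packages[name].add(version)
    return dict(packages)


def find_mismatches(nodes):
    """Return {package: {node: sorted versions}} for packages whose
    installed versions are not identical on every node."""
    all_names = set()
    for pkgs in nodes.values():
        all_names.update(pkgs)
    mismatches = {}
    for name in sorted(all_names):
        per_node = {
            node: sorted(pkgs.get(name, set())) for node, pkgs in nodes.items()
        }
        # More than one distinct version tuple means the nodes disagree.
        if len({tuple(v) for v in per_node.values()}) > 1:
            mismatches[name] = per_node
    return mismatches


if __name__ == "__main__":
    # Versions for the existing node are taken from the post above.
    existing = parse_package_list(
        "ccs 1.0.7-0.XOS.1\n"
        "cman 1.0.11-0.XOS.1\n"
        "cman-kernel 2.6.9-36.0.XOS.1\n"
    )
    # Placeholder only: img62's real versions are unknown.
    img62 = parse_package_list(
        "ccs 1.0.7-0.XOS.1\n"
        "cman HYPOTHETICAL-NEWER\n"
        "cman-kernel 2.6.9-36.0.XOS.1\n"
    )
    for pkg, per_node in find_mismatches(
        {"existing-node": existing, "img62": img62}
    ).items():
        print(pkg, per_node)
```

Any package the script reports is a candidate for upgrading or downgrading so that every node runs the same cman and cman-kernel builds before the new node tries to join again.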