I have the following panic on two nodes, hours apart. Each node is in a different state (as in states of the US). No, I am not running a cluster over a WAN; these are two separate clusters in two different locations. Files are written on one cluster, and I have a script that copies each file to the other cluster with scp. Both machines are running the latest RHEL4 with the latest GFS updates. This just started happening, and it has happened twice since Friday morning.

Any hints? What is happening with clvmd here? What does the "global conflict" message mean?

--
Andre

Oct 1 13:26:47 fs1.fl.apexrad.com kernel: purged 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd mark waiting requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd marked 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd recover event 5 done
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd move flags 0,0,1 ids 2,5,5
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd process held requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd processed 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd resend marked requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd resent 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: clvmd recover event 5 finished
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 total nodes 1
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 rebuild resource directory
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 rebuilt 0 resources
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 recover event 4 done
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 move flags 0,0,1 ids 0,4,4
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 process held requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 processed 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 recover event 4 finished
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 move flags 1,0,0 ids 4,4,4
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 move flags 0,1,0 ids 4,7,4
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 move use event 7
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 recover event 7
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 add node 2
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 total nodes 2
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 rebuild resource directory
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 rebuilt 6 resources
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 purge requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 purged 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 mark waiting requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 marked 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 recover event 7 done
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 move flags 0,0,1 ids 4,7,7
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 process held requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 processed 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 resend marked requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 resent 0 requests
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01 recover event 7 finished
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 245-253 ex 1 own 4158637196, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 254-26c ex 1 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 26d-27b ex 1 own 4158637196, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 27c-28b ex 1 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 28c-29b ex 1 own 4158637196, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 29c-2ac ex 1 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 2ad-2b9 ex 1 own 4158637196, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 2ba-2c7 ex 1 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-ff ex 0 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 c8-2c7 ex 0 own 4158638348, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 200-2c7 ex 0 own 4158638348, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1 ex 0 own 4101191756, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158638828, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-2c7 ex 0 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-fff ex 0 own 4158638348, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 2c8-fff ex 0 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158638348, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-3f ex 0 own 4158636236, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158638348, pid 444u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-fff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 70000-7ffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 80000-8ffff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 90000-9ffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 a0000-affff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 b0000-bffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 c0000-cffff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 d0000-dffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 e0000-effff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 f0000-fffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 100000-10ffff ex 1 own 4101191756, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 110000-11ffff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 120000-12ffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 130000-13ffff ex 1 own 4158638828, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 140000-14ffff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 150000-15ffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 160000-16ffff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 170000-17ffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 180000-18ffff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 190000-19ffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1a0000-1affff ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1b0000-1bffff ex 1 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c0000-1c2aa7 ex 1 own 4158636236, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-ff ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 200-2ff ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c28a8-1c2aa7 ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c26da-1c27d9 ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 44b-54a ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c0ba5-1c0ca4 ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c0780-1c087f ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c12a8-1c13a7 ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c277d-1c287c ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c276a-1c2869 ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c10eb-1c11ea ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c04-1d03 ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 fe00-ffff ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1 ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-fff ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 1c1aa8-1c2aa7 ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-fff ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-3f ex 0 own 4158637196, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 4424 global conflict 0 0-1ff ex 0 own 4158638348, pid 296u
Oct 1 13:26:47 fs1.fl.apexrad.com kernel:
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lock_dlm: Assertion failed on line 428 of file /usr/src/build/765787-i686/BUILD/gfs-kernel-2.6.9-58/smp/src/dlm/lock.c
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lock_dlm: assertion: "!error"
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lock_dlm: time = 185852977
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: lvol01: num=2,684f0dd err=-22 cur=3 req=5 lkf=44
Oct 1 13:26:47 fs1.fl.apexrad.com kernel:
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: ------------[ cut here ]------------
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: kernel BUG at /usr/src/build/765787-i686/BUILD/gfs-kernel-2.6.9-58/smp/src/dlm/lock.c:428!
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: invalid operand: 0000 [#1]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: SMP
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: Modules linked in: nfs nfsd exportfs lockd nfs_acl autofs4 i2c_dev i2c_core lock_dlm(U) gfs(U) lock_harness(U) dlm(U) cman(U) md5 ipv6 sunrpc dm_mirror button battery ac uhci_hcd ehci_hcd hw_random e1000 floppy sg ext3 jbd dm_mod megaraid_mbox megaraid_mm sd_mod scsi_mod
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: CPU: 0
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: EIP: 0060:[<f8df7779>] Not tainted VLI
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: EFLAGS: 00010246 (2.6.9-42.ELsmp)
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: EIP is at do_dlm_lock+0x134/0x14e [lock_dlm]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: eax: 00000001 ebx: ffffffea ecx: d18c5dc0 edx: f8dfc221
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: esi: f8df7798 edi: c387e600 ebp: e194f780 esp: d18c5dbc
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: ds: 007b es: 007b ss: 0068
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: Process rmdir (pid: 23174, threadinfo=d18c5000 task=f64f0b30)
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: Stack: f8dfc221 20202020 32202020 20202020 20202020 34383620 64643066 32200018
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: 20202020 e194f780 00000001 00000003 e194f780 f8df7828 00000005 f8dff940
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: f8919000 f8eba936 00000000 00000001 d16c1dd4 d16c1db8 f8919000 f8eb08fe
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: Call Trace:
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8df7828>] lm_dlm_lock+0x49/0x52 [lock_dlm]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8eba936>] gfs_lm_lock+0x35/0x4d [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8eb08fe>] gfs_glock_xmote_th+0x130/0x172 [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8eaffbd>] rq_promote+0xc8/0x147 [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8eb01a9>] run_queue+0x91/0xc1 [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8eb11b9>] gfs_glock_nq+0xcf/0x116 [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8eb18f5>] nq_m_sync+0x44/0x64 [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8eb1a5e>] gfs_glock_nq_m+0x149/0x15d [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<f8ec87d9>] gfs_rmdir+0x6a/0x168 [gfs]
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<c0168a55>] vfs_rmdir+0x1a3/0x1f1
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<c0168b44>] sys_rmdir+0xa1/0xf4
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<c011ae55>] do_page_fault+0x0/0x5c6
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: [<c02d4703>] syscall_call+0x7/0xb
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: Code: 26 50 0f bf 45 24 50 53 ff 75 08 ff 75 04 ff 75 0c ff 77 18 68 4c c3 df f8 e8 32 b1 32 c7 83 c4 38 68 21 c2 df f8 e8 25 b1 32 c7 <0f> 0b ac 01 5e c1 df f8 68 23 c2 df f8 e8 e0 a8 32 c7 83 c4 20
Oct 1 13:26:47 fs1.fl.apexrad.com kernel: <0>Fatal exception: panic in 5 seconds
Oct 1 13:49:59 fs1.fl.apexrad.com syslogd 1.4.1: restart.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster