Hello,

I had a crash on a server running GFS-6.1 with kernel 2.6.9-11.ELsmp. I am using GFS with an AOE SAN drive. I am not sure whether the problem is with the AOE SAN or with GFS; it would be great if someone could tell me, so that I can redirect the bug report to the CORAID people if needed.

First, the logs show some weird messages about sataide (I am not sure whether the SAN is using that):

Sep 30 17:43:20 srv kernel: e send einval to 2
Sep 30 17:43:20 srv kernel: sataide send einval to 2
Sep 30 17:43:20 srv last message repeated 38 times
Sep 30 17:43:20 srv kernel: sataide unlock ff050383 no id
Sep 30 17:43:20 srv kernel: 231834 id 0 -1,3 1
Sep 30 17:43:20 srv kernel: 7814 qc 2,59f30e -1,5 id ffbe0378 sts 0 0
Sep 30 17:43:20 srv kernel: 19531 lk 5,59f30e id 0 -1,3 0
Sep 30 17:43:20 srv kernel: 4189 lk 2,2ed6bc id 0 -1,3 10001
Sep 30 17:43:20 srv kernel: 7814 qc 5,231834 -1,3 id 5dc0124 sts 0 0
Sep 30 17:43:20 srv kernel: 7814 qc 5,59f30e -1,3 id 27b00cf sts 0 0
Sep 30 17:43:20 srv kernel: 4189 lk 5,2ed6bc id 0 -1,3 1
Sep 30 17:43:20 srv kernel: 7814 qc 2,2ed6bc -1,3 id 1c0202 sts 0 0
Sep 30 17:43:20 srv kernel: 4189 lk 2,2903b3 id 0 -1,3 10001
Sep 30 17:43:20 srv kernel: 7814 qc 5,2ed6bc -1,3 id 227032a sts 0 0
Sep 30 17:43:20 srv kernel: 4189 lk 5,2903b3 id 0 -1,3 1
Sep 30 17:43:20 srv kernel: 7814 qc 2,2903b3 -1,3 id 23c036d sts 0 0
Sep 30 17:43:20 srv kernel: 4189 lk 2,2ba987 id 0 -1,3 10001
Sep 30 17:43:20 srv kernel: 4189 lk 5,2ba987 id 0 -1,3 1
Sep 30 17:43:20 srv kernel: 7814 qc 2,2ba987 -1,3 id 3ab033c sts 0 0
Sep 30 17:43:20 srv kernel: 7814 qc 5,2903b3 -1,3 id 1c80004 sts 0 0
Sep 30 17:43:20 srv kernel: 4189 lk 2,2ce731 id 0 -1,3 10001
Sep 30 17:43:20 srv kernel: 10052 lk 2,500e75 id 0 -1,5 0
Sep 30 17:43:20 srv kernel: 4189 lk 5,2ce731 id 0 -1,3 1
Sep 30 17:43:20 srv kernel: 7814 qc 5,2ba987 -1,3 id 1f003a sts 0 0
Sep 30 17:43:20 srv kernel: 7814 qc 2,2ce731 -1,3 id ff74033d sts 0 0
Sep 30 17:43:20 srv kernel: 19531 lk 5,500e74 id ffd101bd 3,5 805
Sep 30 17:43:20 srv kernel: 7814 qc 5,500e74 3,5 id ffd101bd sts 0 0
Sep 30 17:43:20 srv kernel: 7814 qc 2,500e75 -1,5 id 1660224 sts 0 0
Sep 30 17:43:20 srv kernel: 10052 lk 5,500e75 id 0 -1,3 0
Sep 30 17:43:20 srv kernel: 7814 qc 5,500e75 -1,3 id 3210323 sts 0 0
Sep 30 17:43:20 srv kernel: 29523 lk 2,217df id 0 -1,3 10000
Sep 30 17:43:20 srv kernel: 7814 qc 2,217df -1,3 id 5019b sts 0 0
Sep 30 17:43:20 srv kernel: 29523 lk 5,217df id 0 -1,3 0
Sep 30 17:43:21 srv kernel: 7814 qc 5,217df -1,3 id 2ae0267 sts 0 0
Sep 30 17:43:21 srv kernel: 7814 qc 5,2ce731 -1,3 id 7d0232 sts 0 0
Sep 30 17:43:21 srv kernel: 4189 lk 2,263a00 id 0 -1,3 10001
Sep 30 17:43:21 srv kernel: 7814 qc 2,263a00 -1,3 id 12700c3 sts 0 0
Sep 30 17:43:21 srv kernel: 4189 lk 5,263a00 id 0 -1,3 1
Sep 30 17:43:21 srv kernel: 4189 lk 2,2c446d id 0 -1,3 10001
Sep 30 17:43:21 srv kernel: 7814 qc 5,263a00 -1,3 id ffc00230 sts 0 0
Sep 30 17:43:21 srv kernel: 4189 lk 5,2c446d id 0 -1,3 1
Sep 30 17:43:21 srv kernel: 7814 qc 2,2c446d -1,3 id 34903b4 sts 0 0
Sep 30 17:43:21 srv kernel: 4189 lk 2,1e7a15 id 0 -1,3 10001
Sep 30 17:43:21 srv kernel: 7814 qc 5,2c446d -1,3 id fea901a1 sts 0 0
Sep 30 17:43:21 srv kernel: 4189 lk 5,1e7a15 id 0 -1,3 1

The GFS crash follows immediately afterwards:

Sep 30 17:43:22 srv kernel: lock_dlm: Assertion failed on line 353 of file /usr/src/build/574067-i686/BUILD/smp/src/dlm/lock.c
Sep 30 17:43:22 srv kernel: lock_dlm: assertion: "!error"
Sep 30 17:43:22 srv kernel: lock_dlm: time = 2509316164
Sep 30 17:43:22 srv kernel: sataide: error=-22 num=5,5bf2f1 lkf=801 flags=84
Sep 30 17:43:22 srv kernel:
Sep 30 17:43:22 srv kernel: ------------[ cut here ]------------
Sep 30 17:43:22 srv kernel: kernel BUG at /usr/src/build/574067-i686/BUILD/smp/src/dlm/lock.c:353!
Sep 30 17:43:22 srv kernel: invalid operand: 0000 [#1]
Sep 30 17:43:22 srv kernel: SMP
Sep 30 17:43:22 srv kernel: Modules linked in: lock_dlm(U) aoe(U) gfs(U) lock_harness(U) dlm(U) cman(U) md5 ipv6 joydev button battery ac uhci_hcd ehci_hcd e1000 floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod mptscsih mptbase sd_mod scsi_mod
Sep 30 17:43:22 srv kernel: CPU: 0
Sep 30 17:43:22 srv kernel: EIP: 0060:[<f8b5360d>] Not tainted VLI
Sep 30 17:43:22 srv kernel: EFLAGS: 00010246 (2.6.9-11.ELsmp)
Sep 30 17:43:22 srv kernel: EIP is at do_dlm_unlock+0xaa/0xbf [lock_dlm]
Sep 30 17:43:22 srv kernel: eax: 00000001 ebx: ffffffea ecx: f63f5f04 edx: f8b5809e
Sep 30 17:43:22 srv kernel: esi: cb3ac080 edi: cb3ac080 ebp: f8b1d000 esp: f63f5f00
Sep 30 17:43:23 srv kernel: ds: 007b es: 007b ss: 0068
Sep 30 17:43:23 srv kernel: Process lock_dlm1 (pid: 7818, threadinfo=f63f5000 task=f75bb0b0)
Sep 30 17:43:23 srv kernel: Stack: f8b5809e f8b1d000 00000003 f8b538c0 f8ab24f2 00000001 dcbdb3c0 dcbdb3a4
Sep 30 17:43:23 srv kernel: f8aa8852 f8add0c0 d73b9e80 dcbdb3a4 f8add0c0 cb3ac080 f8aa7d4b dcbdb3a4
Sep 30 17:43:23 srv kernel: 00000001 00000001 f8aa7e02 dcbdb3c0 dcbdb3a4 f8aa99af cb3ac080 f7d50e00
Sep 30 17:43:23 srv kernel: Call Trace:
Sep 30 17:43:23 srv kernel: [<f8b538c0>] lm_dlm_unlock+0x14/0x1c [lock_dlm]
Sep 30 17:43:23 srv kernel: [<f8ab24f2>] gfs_lm_unlock+0x2c/0x42 [gfs]
Sep 30 17:43:23 srv kernel: [<f8aa8852>] gfs_glock_drop_th+0xf3/0x12d [gfs]
Sep 30 17:43:23 srv kernel: [<f8aa7d4b>] rq_demote+0x7f/0x98 [gfs]
Sep 30 17:43:23 srv kernel: [<f8aa7e02>] run_queue+0x5a/0xc1 [gfs]
Sep 30 17:43:23 srv kernel: [<f8aa99af>] blocking_cb+0x39/0x7a [gfs]
Sep 30 17:43:23 srv kernel: [<f8b5727b>] process_blocking+0x90/0x93 [lock_dlm]
Sep 30 17:43:23 srv kernel: [<f8b578c8>] dlm_async+0x28b/0x2ff [lock_dlm]
Sep 30 17:43:23 srv kernel: [<c011dc6f>] default_wake_function+0x0/0xc
Sep 30 17:43:23 srv kernel: [<c011dc6f>] default_wake_function+0x0/0xc
Sep 30 17:43:23 srv kernel: [<f8b5763d>] dlm_async+0x0/0x2ff [lock_dlm]
Sep 30 17:43:23 srv kernel: [<c0132e31>] kthread+0x73/0x9b
Sep 30 17:43:23 srv kernel: [<c0132dbe>] kthread+0x0/0x9b
Sep 30 17:43:23 srv kernel: [<c01041f1>] kernel_thread_helper+0x5/0xb
Sep 30 17:43:23 srv kernel: Code: 76 34 8b 06 ff 76 2c ff 76 08 ff 76 04 ff 76 0c 53 ff 70 18 68 a9 81 b5 f8 e8 d6 e3 5c c7 83 c4 34 68 9e 80 b5 f8 e8 c9 e3 5c c7 <0f> 0b 61 01 ef 7f b5 f8 68 a0 80 b5 f8 e8 84 db 5c c7 5b 5e c3
Sep 30 17:43:23 srv kernel: <0>Fatal exception: panic in 5 seconds

Cheers, Chmouel.
--
Chmouel Boudjnah - Squiz.net - http://www.squiz.net

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster