Our setup is:
* We are using GFS from the CVS stable branch on our 2.6.14.7
cluster. We updated today to the newest CVS version and only had to
change the mutex() calls.
* The 4 nodes are running Debian sarge.
* The 4 nodes act as NFS servers for roughly 640 client nodes.
* Brocade switch with SGI TP9300, 4 controllers (15 TB).
We did a lot of testing with bonnie, iozone and other tools and jobs,
and we could not crash the cluster. Now that the cluster is in
production we get a lot of nfsd crashes with "EIP is at gfs_create".
We had them with our previous kernel 2.16.4.4 and with this one and
the "latest" CVS stable version. The server keeps running, but the load
is high and it does not respond any more. If we are lucky, only one NFS
thread is gone and the rest are still up. The other nodes still work.
Have other users experienced this kind of problem, and is there maybe
a solution for it?
Regards,
Here is an oops message:
Unable to handle kernel NULL pointer dereference at virtual address 00000038
printing eip:
f89bf999
*pde = 37bff001
*pte = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg
ide_floppy ide_cd cdrom qla2300 qla2xxx_conf qla2xxx firmware_class
siimage piix e1000 gfs lock_harness dm_mod
CPU: 0
EIP: 0060:[<f89bf999>] Tainted: GF VLI
EFLAGS: 00010246 (2.6.14.7-sara1)
EIP is at gfs_create+0xa9/0x1e0 [gfs]
eax: ffffffef ebx: ffffffef ecx: 00000001 edx: 00000000
esi: f296e24c edi: ebf01e18 ebp: ebf01e84 esp: ebf01df8
ds: 007b es: 007b ss: 0068
Process nfsd (pid: 16924, threadinfo=ebf00000 task=ebe84540)
Stack: ebf01e48 f296e24c 00000001 00008180 ebf01e18 00000001 f8cb9000 dd042254
       ebf01e18 ebf01e18 00000000 ebe84540 00000001 00000120 00000000 000000c2
       00000000 00000001 ebf01e40 ebf01e40 ebf01e48 ebf01e48 df0bd858 ebe84540
Call Trace:
[<c0103e5f>] show_stack+0x7f/0xa0
[<c0104012>] show_registers+0x162/0x1d0
[<c0104224>] die+0xf4/0x180
[<c035f697>] do_page_fault+0x2e7/0x6b2
[<c0103b03>] error_code+0x4f/0x54
[<c016b663>] vfs_create+0x83/0xf0
[<c01b81ce>] nfsd_create_v3+0x40e/0x550
[<c01bed2d>] nfsd3_proc_create+0x11d/0x180
[<c01b2f87>] nfsd_dispatch+0xd7/0x200
[<c0353a96>] svc_process+0x536/0x670
[<c01b2d1d>] nfsd+0x1bd/0x350
[<c010127d>] kernel_thread_helper+0x5/0x18
Code: 24 08 8d 45 c4 89 54 24 0c 89 74 24 04 89 04 24 e8 1d c3 fe ff
85 c0 89 c3 0f 84 2e 01 00 00 83 f8 ef 0f 85 13 01 00 00 8b 55 14
<80> 7a 38 00 0f 88 06 01 00 00 89 7c 24 0c 31 c0 8d 55 c4 89 44
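
A few values in the oops above can be cross-checked mechanically. The Python sketch below is only a reading aid under stated assumptions, not a confirmed diagnosis: it assumes eax holds a negative errno return value, and it assumes the marked bytes "<80> 7a 38 00" are the start of the faulting instruction (opcode 0x80 with ModRM reg field 7 being a byte compare against an immediate). The helper names are made up for this illustration.

```python
import errno
import re

# Excerpt copied from the oops above; only the lines this sketch needs.
OOPS = """\
Unable to handle kernel NULL pointer dereference at virtual address
00000038
EIP is at gfs_create+0xa9/0x1e0 [gfs]
eax: ffffffef ebx: ffffffef ecx: 00000001 edx: 00000000
"""

# Marked bytes from the "Code:" line (assumed to start the faulting instruction).
FAULT_BYTES = bytes.fromhex("807a3800")

# 32-bit register numbering used by the ModRM r/m field.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def parse_registers(text):
    """Return {register: value} for every 'exx: hhhhhhhh' pair in the text."""
    pairs = re.findall(r"\b(e[a-z]{2}): ([0-9a-f]{8})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def to_signed32(value):
    """Interpret an unsigned 32-bit value as two's-complement signed."""
    return value - (1 << 32) if value & 0x80000000 else value


def decode_modrm_disp8(code):
    """Split the ModRM byte and return (mod, reg, base register, disp8).

    Only handles the simple mod=01 form (base register + 8-bit displacement,
    no SIB byte), which is all this excerpt needs.
    """
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("not a plain base+disp8 operand")
    return mod, reg, REG32[rm], code[2]


regs = parse_registers(OOPS)
fault_addr = int(re.search(r"virtual address\s+([0-9a-f]{8})", OOPS).group(1), 16)
eax = to_signed32(regs["eax"])
mod, reg, base, disp = decode_modrm_disp8(FAULT_BYTES)

print("eax as signed int:", eax, "->", errno.errorcode.get(-eax, "unknown"))
print("operand: 0x%x(%%%s), %s = 0x%08x" % (disp, base, base, regs[base]))
print("matches fault address:", regs[base] + disp == fault_addr)
```

Read this way, eax is -17 (EEXIST) and the faulting operand is offset 0x38 from edx, which is 0, matching the reported address 00000038. Which pointer edx was supposed to hold cannot be told from the oops alone.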
--
Bas van der Vlies
basv@xxxxxxx
--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster