Hi all,
Just had a crash on our 3-node Red Hat Enterprise Linux 5.4 cluster
that looks a lot like
https://bugzilla.redhat.com/show_bug.cgi?id=520720. We're running
kernel 2.6.18-164.11.1.el5. Here is the traceback:
[2010-03-03 19:18:27]Unable to handle kernel NULL pointer dereference at 0000000000000078 RIP:
[2010-03-03 19:18:27] [<ffffffff88572766>] :gfs2:revoke_lo_add+0x1a/0x32
[2010-03-03 19:18:27]PGD 0
[2010-03-03 19:18:27]Oops: 0002 [1] SMP
[2010-03-03 19:18:27]last sysfs file: /devices/pci0000:00/0000:00:06.0/0000:0b:00.0/0000:0c:09.0/0000:0d:00.0/host0/rport-0:0-4/target0:0:3/0:0:3:2/state
[2010-03-03 19:18:27]CPU 13
[2010-03-03 19:18:27]Modules linked in: ipt_MASQUERADE iptable_nat
ip_nat bridge autofs4 hidp l2cap bluetooth lock_dlm gfs2(U) dlm configfs
lockd sunrpc ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink
xt_tcpudp ipt_REJECT iptable_filter ip_tables arpt_mangle
arptable_filter arp_tables x_tables dm_round_robin dm_multipath scsi_dh
video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi
acpi_memhotplug ac parport_pc lp parport sg serio_raw pcspkr bnx2 ide_cd
hpilo cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot
dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc ata_piix
libata shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
[2010-03-03 19:18:28]Pid: 792, comm: kswapd0 Tainted: G 2.6.18-164.11.1.el5 #1
[2010-03-03 19:18:28]RIP: 0010:[<ffffffff88572766>] [<ffffffff88572766>] :gfs2:revoke_lo_add+0x1a/0x32
[2010-03-03 19:18:28]RSP: 0018:ffff81082ef61ae8 EFLAGS: 00010282
[2010-03-03 19:18:28]RAX: 0000000000000000 RBX: ffff81072a4b3610 RCX: ffff8103a31d78a0
[2010-03-03 19:18:28]RDX: ffff8107768b63f0 RSI: ffff8108172e17c0 RDI: ffff8108172e1000
[2010-03-03 19:18:28]RBP: ffff8107768b63d0 R08: ffff81082fc7ef06 R09: ffff81082ef61b20
[2010-03-03 19:18:28]R10: ffff810119d7cc18 R11: ffffffff8857274c R12: ffff8108172e1000
[2010-03-03 19:18:29]R13: 0000000000000000 R14: ffff81072a4b3610 R15: ffff8108172e1000
[2010-03-03 19:18:29]FS: 0000000000000000(0000) GS:ffff81082fc7edc0(0000) knlGS:0000000000000000
[2010-03-03 19:18:29]CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[2010-03-03 19:18:29]CR2: 0000000000000078 CR3: 00000006cdbe7000 CR4: 00000000000006e0
[2010-03-03 19:18:29]Process kswapd0 (pid: 792, threadinfo ffff81082ef60000, task ffff81082f5407e0)
[2010-03-03 19:18:29]Stack: ffffffff88573bfb 000000002ef61e10 ffff81072a4b3610 ffff81010962e028
[2010-03-03 19:18:29] 0000000000000000 0000000000000000 ffffffff88574ee0 000000000000000e
[2010-03-03 19:18:29] ffff81010962e028 0000000000413ac8 ffff81082ef61cf0 ffff8108172e1000
[2010-03-03 19:18:29]Call Trace:
[2010-03-03 19:18:29] [<ffffffff88573bfb>] :gfs2:gfs2_remove_from_journal+0x11a/0x12c
[2010-03-03 19:18:29] [<ffffffff88574ee0>] :gfs2:gfs2_invalidatepage+0xea/0x151
[2010-03-03 19:18:29] [<ffffffff88574c45>] :gfs2:gfs2_writepage_common+0x95/0xb1
[2010-03-03 19:18:29] [<ffffffff88575129>] :gfs2:gfs2_jdata_writepage+0x2a/0xa0
[2010-03-03 19:18:29] [<ffffffff800ca21c>] shrink_inactive_list+0x3fd/0x8d8
[2010-03-03 19:18:29] [<ffffffff8004819b>] __pagevec_release+0x19/0x22
[2010-03-03 19:18:29] [<ffffffff800c9cfe>] shrink_active_list+0x4b4/0x4c4
[2010-03-03 19:18:30] [<ffffffff80013007>] shrink_zone+0xf7/0x15d
[2010-03-03 19:18:30] [<ffffffff80057e41>] kswapd+0x323/0x46c
[2010-03-03 19:18:30] [<ffffffff800a00b7>] autoremove_wake_function+0x0/0x2e
[2010-03-03 19:18:30] [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
[2010-03-03 19:18:30] [<ffffffff80057b1e>] kswapd+0x0/0x46c
[2010-03-03 19:18:30] [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
[2010-03-03 19:18:30] [<ffffffff80032950>] kthread+0xfe/0x132
[2010-03-03 19:18:30] [<ffffffff8009cd34>] request_module+0x0/0x14d
[2010-03-03 19:18:30] [<ffffffff8005dfb1>] child_rip+0xa/0x11
[2010-03-03 19:18:30] [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
[2010-03-03 19:18:30] [<ffffffff80032852>] kthread+0x0/0x132
[2010-03-03 19:18:30] [<ffffffff8005dfa7>] child_rip+0x0/0x11
[2010-03-03 19:18:30]
[2010-03-03 19:18:30]
[2010-03-03 19:18:30]Code: ff 40 78 c7 40 50 01 00 00 00 ff 87 94 07 00 00 48 89 d7 e9
[2010-03-03 19:18:30]RIP [<ffffffff88572766>] :gfs2:revoke_lo_add+0x1a/0x32
[2010-03-03 19:18:30] RSP <ffff81082ef61ae8>
[2010-03-03 19:18:30]CR2: 0000000000000078
[2010-03-03 19:18:30] <0>Kernel panic - not syncing: Fatal exception
Since we're already running the latest 5.4 kernel, it's not clear what
might be going on here. There is a note in the bug about making sure
the gfs2-kmod from 5.2 isn't still around. Which gfs2-kmod versions
count as the old 5.2 ones, or should I just remove every installed
gfs2-kmod package?
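
One thing that may be relevant: gfs2 is the only module in the list above that carries a flag suffix, gfs2(U). Below is a minimal Python sketch for pulling flagged modules out of a saved copy of the console log. The function name and the log file name are made up for illustration. My reading of (U) as the RHEL 5 marker for an unsigned module, which would fit a gfs2 module installed from a separate gfs2-kmod package rather than from the kernel package, is an assumption and not something confirmed in the bug.

```python
import re

# Console timestamp prefix as it appears in the captured log,
# e.g. "[2010-03-03 19:18:27]".
TIMESTAMP = r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]"


def flagged_modules(oops_text):
    """Return {module_name: flags} for modules listed as name(FLAGS).

    Hypothetical helper: it only looks at the "Modules linked in:" list
    and does not interpret what the flags mean.
    """
    # Drop carriage-return debris left by the serial console capture.
    cleaned = oops_text.replace("^M", "").replace("\r", "")
    # The module list may wrap over several lines; it ends at the next
    # timestamped line or at the end of the text.
    match = re.search(
        r"Modules linked in:(.*?)(?=" + TIMESTAMP + r"|\Z)",
        cleaned,
        re.DOTALL,
    )
    if match is None:
        return {}
    flagged = {}
    for token in match.group(1).split():
        m = re.fullmatch(r"([\w-]+)\(([A-Za-z]+)\)", token)
        if m:
            flagged[m.group(1)] = m.group(2)
    return flagged
```

Usage would be something like flagged_modules(open("oops.log").read()), where oops.log is a placeholder for wherever the console output was saved.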
-- scooter
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster