Greetings,
we had a log replay failure due to errors (error 28, i.e. ENOSPC) that the
underlying block device returned:
[49133.801406] XFS (dm-95): metadata I/O error: block 0x270e8c180
("xlog_recover_iodone") error 28 numblks 16
[49133.802495] XFS (dm-95): log mount/recovery failed: error 28
[49133.802644] XFS (dm-95): log mount failed
and then the kernel panicked [1].
Looking at the code, when xfs_mountfs() fails, xfs_fs_fill_super() cleans up
and eventually frees the "xfs_mount" structure. However,
xfs_buf_iodone_work() can still be delivered afterwards through
"xfslogd_workqueue", which is static rather than per-mount, and that callback
holds a pointer to the now-freed "xfs_mount" and may dereference it, as in
[1]. Does this analysis sound correct? The kernel is 3.8.13, but looking at
the XFS master branch, it might have the same issue.
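
For reference, this is the completion path as I read it in 3.8 (the
annotations are my interpretation, so please correct me if I got it wrong):

  xfs_buf_iodone_work()                  runs later, from the static xfslogd_workqueue
    -> bp->b_iodone(bp)                  == xlog_recover_iodone()
         -> xfs_force_shutdown(bp->b_target->bt_mount, SHUTDOWN_META_IO_ERROR)
              -> xfs_notice(mp, ...)     prints "XFS (%s): ..." via mp->m_fsname

By the time the work item runs, the "xfs_mount" has already been freed, so
mp->m_fsname is garbage; the bogus pointer 0x640000450008cf9d in RDI at the
point strnlen() faults in [1] looks consistent with that.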
Should we flush this static workqueue before unmounting?
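Something along these lines is what I have in mind (untested sketch against
3.8; xfs_flush_iodone_workqueue() is a made-up helper, since
"xfslogd_workqueue" is static to fs/xfs/xfs_buf.c, and the goto label is a
placeholder rather than the real one):

  /* fs/xfs/xfs_buf.c: hypothetical helper that drains any queued
   * xfs_buf_iodone_work() items off the static workqueue */
  void
  xfs_flush_iodone_workqueue(void)
  {
  	flush_workqueue(xfslogd_workqueue);
  }

  /* fs/xfs/xfs_super.c, xfs_fs_fill_super(): sketch of the failure path */
  	error = xfs_mountfs(mp);
  	if (error)
  		goto out_teardown;		/* placeholder label */
  	...
   out_teardown:
  	/*
  	 * Drain buffer I/O completions that may still reference mp
  	 * before the rest of the teardown frees it.
  	 */
  	xfs_flush_iodone_workqueue();
  	...				/* existing cleanup */
  	xfs_free_fsname(mp);
  	kfree(mp);

Or would it be better to make this workqueue per-mount, like the other XFS
workqueues?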
Thanks,
Alex.
[1]
[49133.804546] general protection fault: 0000 [#1] SMP
[49133.808033] xcbc rmd160 crypto_null af_key xfrm_algo scsi_dh cirrus
psmouse ttm drm_kms_helper serio_raw drm i2c_piix4 sysimgblt virtio_balloon
sysfillrect syscopyarea nfsd(OF) kvm nfs_acl auth_rpcgss nfs fscache
microcode lockd mac_hid sunrpc lp parport floppy ixgbevf(OF)
[49133.808033] CPU 2
[49133.808033] Pid: 2907, comm: kworker/2:1H Tainted: GF W O
3.8.13-030813-generic #201305111843 Bochs Bochs
[49133.808033] RIP: 0010:[<ffffffff813582fb>] [<ffffffff813582fb>]
strnlen+0xb/0x30
[49133.808033] RSP: 0018:ffff8801e31c5b08 EFLAGS: 00010086
[49133.808033] RAX: 0000000000000000 RBX: ffffffff81e4e527 RCX:
0000000000000000
[49133.808033] RDX: 640000450008cf9d RSI: ffffffffffffffff RDI:
640000450008cf9d
[49133.808033] RBP: ffff8801e31c5b08 R08: 000000000000ffff R09:
000000000000ffff
[49133.808033] R10: 0000000000000000 R11: 0000000000000ffe R12:
640000450008cf9d
[49133.808033] R13: ffffffff81e4e900 R14: 0000000000000000 R15:
000000000000ffff
[49133.808033] FS: 0000000000000000(0000) GS:ffff88021fd00000(0000)
knlGS:0000000000000000
[49133.808033] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[49133.808033] CR2: 00007fa4a91abd80 CR3: 000000020e783000 CR4:
00000000000006e0
[49133.808033] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[49133.808033] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[49133.808033] Process kworker/2:1H (pid: 2907, threadinfo ffff8801e31c4000,
task ffff8802124f45c0)
[49133.808033] Stack:
[49133.808033] ffff8801e31c5b48 ffffffff81359d8e ffff8801e31c5b28
ffffffff81e4e527
[49133.808033] ffffffffa0636a1e ffff8801e31c5c80 ffffffffa0636a1e
ffffffff81e4e900
[49133.808033] ffff8801e31c5bc8 ffffffff8135af89 ffff8801e31c5bc8
ffffffff8105a4e7
[49133.808033] Call Trace:
[49133.808033] [<ffffffff81359d8e>] string.isra.4+0x3e/0xd0
[49133.808033] [<ffffffff8135af89>] vsnprintf+0x219/0x640
[49133.808033] [<ffffffff8105a4e7>] ? msg_print_text+0xb7/0x1b0
[49133.808033] [<ffffffff8135b471>] vscnprintf+0x11/0x30
[49133.808033] [<ffffffff8105b3b1>] vprintk_emit+0xc1/0x490
[49133.808033] [<ffffffff8105b460>] ? vprintk_emit+0x170/0x490
[49133.808033] [<ffffffff816d5848>] printk+0x61/0x63
[49133.808033] [<ffffffffa05ba261>] __xfs_printk+0x31/0x50 [xfs]
[49133.808033] [<ffffffffa05ba4b3>] xfs_notice+0x53/0x60 [xfs]
[49133.808033] [<ffffffffa05b14a5>] xfs_do_force_shutdown+0xf5/0x180 [xfs]
[49133.808033] [<ffffffffa05f6c38>] ? xlog_recover_iodone+0x48/0x70 [xfs]
[49133.808033] [<ffffffffa05f6c38>] xlog_recover_iodone+0x48/0x70 [xfs]
[49133.808033] [<ffffffffa05a749d>] xfs_buf_iodone_work+0x4d/0xa0 [xfs]
[49133.808033] [<ffffffff81078b81>] process_one_work+0x141/0x490
[49133.808033] [<ffffffff81079b48>] worker_thread+0x168/0x400
[49133.808033] [<ffffffff810799e0>] ? manage_workers+0x120/0x120
[49133.808033] [<ffffffff8107f050>] kthread+0xc0/0xd0
[49133.808033] [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
[49133.808033] [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
[49133.808033] [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0