Greetings,
We are hitting an issue where XFS prints messages like
“XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)”
together with a stack trace like the one in [1]. Eventually, the hung-task
panic kicks in with stack traces like [2].
We are running kernel 3.8.13. I see that a similar issue was discussed in
http://oss.sgi.com/archives/xfs/2012-01/msg00341.html
but, as far as I can tell, no code changes followed, compared to
what we have in 3.8.13.
Any suggestions on how to move forward with this problem? For example, does
this memory really have to be allocated with kmalloc (i.e., physically
contiguous), or can vmalloc be used?
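The fallback being asked about can be sketched roughly as follows. This is only a user-space model, not XFS or kernel code: model_kmalloc, model_vmalloc, model_alloc_large, model_free and MODEL_CONTIG_LIMIT are hypothetical names, malloc stands in for both allocators, and the size limit is an arbitrary stand-in for a fragmented page allocator that cannot satisfy a large contiguous request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical threshold: requests above this size are assumed to fail in
 * the "contiguous" allocator, imitating memory fragmentation. */
#define MODEL_CONTIG_LIMIT 4096

enum model_origin { MODEL_CONTIG, MODEL_VIRT };

/* A buffer plus a record of which allocator produced it. */
struct model_buf {
    void *ptr;
    enum model_origin origin;
};

/* Stand-in for kmalloc: fails for large sizes. */
static void *model_kmalloc(size_t size)
{
    if (size > MODEL_CONTIG_LIMIT)
        return NULL;
    return malloc(size);
}

/* Stand-in for vmalloc: no contiguity requirement, so no size limit here. */
static void *model_vmalloc(size_t size)
{
    return malloc(size);
}

/* Try the contiguous allocator first; fall back instead of retrying forever. */
static struct model_buf model_alloc_large(size_t size)
{
    struct model_buf buf;

    assert(size > 0);
    buf.ptr = model_kmalloc(size);
    buf.origin = MODEL_CONTIG;
    if (buf.ptr == NULL) {
        buf.ptr = model_vmalloc(size);
        buf.origin = MODEL_VIRT;
    }
    return buf;
}

/* Both stand-ins use malloc, so one free() suffices in this model. */
static void model_free(struct model_buf buf)
{
    free(buf.ptr);
}
```

In the model, a 64-byte request comes back marked MODEL_CONTIG and a 1 MiB request comes back marked MODEL_VIRT. In the kernel the free path would have to tell the two kinds of memory apart (for example with is_vmalloc_addr()), and whether the extent array code can tolerate memory that is not physically contiguous is exactly the open question.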
Thanks,
Alex.
[1]
[109626.075483] nfsd D 0000000000000002 0 20042 2 0x00000000
[109626.075483] ffff88026ac3ef58 0000000000000046 ffff88031fffbd80 ffff88026ac40000
[109626.075483] ffff88026ac3ffd8 ffff88026ac3ffd8 ffff88026ac3ffd8 0000000000013f40
[109626.075483] ffff88030e542e80 ffff88026ac40000 ffff88030e58c000 ffff88026ac3ef90
[109626.075483] Call Trace:
[109626.075483] [<ffffffff816ec509>] schedule+0x29/0x70
[109626.075483] [<ffffffff816eabd0>] schedule_timeout+0x130/0x250
[109626.075483] [<ffffffff8106a340>] ? cascade+0xa0/0xa0
[109626.075483] [<ffffffff816ec8a2>] io_schedule_timeout+0xa2/0x100
[109626.075483] [<ffffffff81185311>] ? __kmalloc+0x181/0x190
[109626.075483] [<ffffffff81153bc0>] congestion_wait+0x80/0x120
[109626.075483] [<ffffffff81185311>] ? __kmalloc+0x181/0x190
[109626.075483] [<ffffffff8107fc10>] ? add_wait_queue+0x60/0x60
[109626.075483] [<ffffffff81185260>] ? __kmalloc+0xd0/0x190
[109626.075483] [<ffffffffa07631fc>] ? kmem_alloc+0x5c/0xe0 [xfs]
[109626.075483] [<ffffffffa0763330>] ? kmem_realloc+0x30/0x70 [xfs]
[109626.075483] [<ffffffffa0795f0d>] ? xfs_iext_realloc_indirect+0x3d/0x60 [xfs]
[109626.075483] [<ffffffffa0795f6f>] ? xfs_iext_irec_new+0x3f/0x180 [xfs]
[109626.075483] [<ffffffffa0796229>] ? xfs_iext_add_indirect_multi+0x179/0x2b0 [xfs]
[109626.075483] [<ffffffffa079662e>] ? xfs_iext_add+0xce/0x290 [xfs]
[109626.075483] [<ffffffff81097c33>] ? update_curr+0x143/0x1f0
[109626.075483] [<ffffffffa0796842>] ? xfs_iext_insert+0x52/0x100 [xfs]
[109626.075483] [<ffffffffa0771b43>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
[109626.075483] [<ffffffffa0771b43>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
[109626.075483] [<ffffffffa07950d7>] ? xfs_iext_bno_to_ext+0xf7/0x160 [xfs]
[109626.075483] [<ffffffffa0772389>] ? xfs_bmapi_reserve_delalloc+0x279/0x2a0 [xfs]
[109626.075483] [<ffffffffa07793b2>] ? xfs_bmapi_delay+0x122/0x270 [xfs]
[109626.075483] [<ffffffffa0758703>] ? xfs_iomap_write_delay+0x173/0x320 [xfs]
[109626.075483] [<ffffffffa077909c>] ? xfs_bmapi_read+0xfc/0x2f0 [xfs]
[109626.075483] [<ffffffff8135d8f3>] ? call_rwsem_down_write_failed+0x13/0x20
[109626.075483] [<ffffffffa0745b40>] ? __xfs_get_blocks+0x280/0x550 [xfs]
[109626.075483] [<ffffffffa0745e41>] ? xfs_get_blocks+0x11/0x20 [xfs]
[109626.075483] [<ffffffff811cf77e>] ? __block_write_begin+0x1ae/0x4e0
[109626.075483] [<ffffffffa0745e30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
[109626.075483] [<ffffffff81135fff>] ? grab_cache_page_write_begin+0x8f/0xf0
[109626.075483] [<ffffffffa074509f>] ? xfs_vm_write_begin+0x5f/0xe0 [xfs]
[109626.075483] [<ffffffff8113552a>] ? generic_perform_write+0xca/0x210
[109626.075483] [<ffffffff811356cd>] ? generic_file_buffered_write+0x5d/0x90
[109626.075483] [<ffffffffa07502d5>] ? xfs_file_buffered_aio_write+0x115/0x1c0 [xfs]
[109626.075483] [<ffffffff816159f4>] ? ip_finish_output+0x224/0x3b0
[109626.075483] [<ffffffffa075047c>] ? xfs_file_aio_write+0xfc/0x1b0 [xfs]
[109626.075483] [<ffffffffa0750380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
[109626.075483] [<ffffffff8119b8c3>] ? do_sync_readv_writev+0xa3/0xe0
[109626.075483] [<ffffffff8119bb8d>] ? do_readv_writev+0xcd/0x1d0
[109626.075483] [<ffffffff810877e0>] ? set_groups+0x40/0x60
[109626.075483] [<ffffffffa01be6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
[109626.075483] [<ffffffff8119bccc>] ? vfs_writev+0x3c/0x50
[109626.075483] [<ffffffffa01b7dd2>] ? nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
[109626.075483] [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
[109626.075483] [<ffffffffa01ba679>] ? nfsd_write+0xf9/0x110 [nfsd]
[109626.075483] [<ffffffffa01c4dd1>] ? nfsd3_proc_write+0xb1/0x140 [nfsd]
[109626.075483] [<ffffffffa01b3d62>] ? nfsd_dispatch+0x102/0x270 [nfsd]
[109626.075483] [<ffffffffa012bb48>] ? svc_process_common+0x328/0x5e0 [sunrpc]
[109626.075483] [<ffffffffa012c153>] ? svc_process+0x103/0x160 [sunrpc]
[109626.075483] [<ffffffffa01b372f>] ? nfsd+0xbf/0x130 [nfsd]
[109626.075483] [<ffffffffa01b3670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[109626.075483] [<ffffffff8107f050>] ? kthread+0xc0/0xd0
[109626.075483] [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
[109626.075483] [<ffffffff816f61ec>] ? ret_from_fork+0x7c/0xb0
[109626.075483] [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
[2]
[87303.976119] INFO: task nfsd:5684 blocked for more than 180 seconds.
[87303.976976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87303.978012] nfsd D 0000000000000003 0 5684 2 0x00000000
[87303.978017] ffff8802506d37e8 0000000000000046 ffff880200000000 ffff880307ce3c9c
[87303.978020] ffff8802506d3fd8 ffff8802506d3fd8 ffff8802506d3fd8 0000000000013f40
[87303.978023] ffff88030e5445c0 ffff8802ca9945c0 ffff8802506d37c8 ffff88009dd8fd60
[87303.978026] Call Trace:
[87303.978036] [<ffffffff816ec509>] schedule+0x29/0x70
[87303.978039] [<ffffffff816ec7be>] schedule_preempt_disabled+0xe/0x10
[87303.978042] [<ffffffff816eb437>] __mutex_lock_slowpath+0xd7/0x150
[87303.978045] [<ffffffff816eb04a>] mutex_lock+0x2a/0x50
[87303.978076] [<ffffffffa075a227>] xfs_file_buffered_aio_write+0x67/0x1c0 [xfs]
[87303.978089] [<ffffffffa075a47c>] xfs_file_aio_write+0xfc/0x1b0 [xfs]
[87303.978101] [<ffffffffa075a380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
[87303.978105] [<ffffffff8119b8c3>] do_sync_readv_writev+0xa3/0xe0
[87303.978109] [<ffffffff8119bb8d>] do_readv_writev+0xcd/0x1d0
[87303.978112] [<ffffffff811afe51>] ? prepend_path+0xf1/0x1e0
[87303.978115] [<ffffffff811856fc>] ? kmem_cache_alloc_trace+0x11c/0x140
[87303.978119] [<ffffffff8130c425>] ? aa_alloc_task_context+0x35/0x50
[87303.978122] [<ffffffff8119bccc>] vfs_writev+0x3c/0x50
[87303.978145] [<ffffffffa0266dd2>] nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
[87303.978149] [<ffffffff816ed43e>] ? _raw_spin_lock+0xe/0x20
[87303.978159] [<ffffffffa0284ba4>] ? find_confirmed_client.isra.58+0x144/0x1a0 [nfsd]
[87303.978167] [<ffffffffa0284d48>] ? nfsd4_lookup_stateid+0xc8/0x120 [nfsd]
[87303.978174] [<ffffffffa0269623>] nfsd_write+0xa3/0x110 [nfsd]
[87303.978182] [<ffffffffa027794c>] nfsd4_write+0x1cc/0x250 [nfsd]
[87303.978189] [<ffffffffa027746c>] nfsd4_proc_compound+0x5ac/0x7a0 [nfsd]
[87303.978197] [<ffffffffa0262d62>] nfsd_dispatch+0x102/0x270 [nfsd]
[87303.978214] [<ffffffffa01f3b48>] svc_process_common+0x328/0x5e0 [sunrpc]
[87303.978225] [<ffffffffa01f4153>] svc_process+0x103/0x160 [sunrpc]
[87303.978232] [<ffffffffa026272f>] nfsd+0xbf/0x130 [nfsd]
[87303.978238] [<ffffffffa0262670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[87303.978243] [<ffffffff8107f050>] kthread+0xc0/0xd0
[87303.978246] [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
[87303.978250] [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
[87303.978253] [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0