Hello again, Occasionally, when one of our machines is under memory pressure and an "rm" command is used, it will deadlock. Maybe someone familiar with the code can take a look. In the trace below, it looks like kswapd was shrinking the inode cache, took iprune_mutex, and while calling clear_inode on a list of inodes to dispose, he blocked on an xfs_buf lock for the on-disk inode data. Meanwhile "rm" is trying to read in the inode using xfs_iread, has the xfs_buf locked, and is trying to allocate memory to copy the contents to an in-memory inode. But that allocation makes a trip through shrink icache and blocks on the iprune_mutex held by kswapd. kswapd is calling a path in _xfs_buf_find which is ignoring the XBF_DONT_BLOCK flag, even though that path seems fully prepared to fail if the buffer is locked. The patch is against 2.6.27, but it applies OK to "mainline" from kernel.org. PID: 239 TASK: f7924b60 CPU: 0 COMMAND: "kswapd0" #0 [f58adadc] schedule at c03abd22 #1 [f58adb38] schedule_timeout at c03ac4ec #2 [f58adb80] __down at c03acc9a #3 [f58adba4] down at c015690c #4 [f58adbb4] xfs_buf_lock at f8de2912 not honoring the XBF_BUSY flag #5 [f58adbc0] _xfs_buf_find at f8de2284 #6 [f58adbf4] xfs_buf_get_flags at f8de2357 #7 [f58adc1c] xfs_buf_read_flags at f8de245d XFS_BUF_LOCK|BUF_BUSY ie don't block #8 [f58adc34] xfs_trans_read_buf at f8dd74b5 #9 [f58adc5c] xfs_imap_to_bp at f8dbc575 get the locked xfs_buf for inode #10 [f58adc88] xfs_inotobp at f8dbc6db #11 [f58adcd0] xfs_iunlink_remove at f8dbeb86 #12 [f58add38] xfs_ifree at f8dbf27a #13 [f58add78] xfs_inactive at f8ddb643 #14 [f58addc4] xfs_fs_clear_inode at f8dea765 #15 [f58adde4] clear_inode at c01d4099 #16 [f58addf4] generic_delete_inode at c01d4ecd #17 [f58ade08] generic_drop_inode at c01d50af #18 [f58ade10] iput at c01d5115 #19 [f58ade1c] gridfs_clear_inode at f8e4d67a #20 [f58adefc] balance_pgdat at c019ab1e called shrink_icache/prune_icache/dispose_list #21 [f58adf78] kswapd at c019ad7f #22 [f58adfd0] kthread at c0151a82 #23 [f58adfe4] kernel_thread_helper at c010aa55 PID: 22357 TASK: f41d6480 CPU: 0 COMMAND: "rm" #0 [e980d934] schedule at c03abd22 #1 [e980d990] __mutex_lock_slowpath at c03ac8d1 #2 [e980d9b8] mutex_lock at c03ac78d #3 [e980d9c0] prune_icache at c01d437f #4 [e980d9e8] shrink_icache_memory at c01d4537 #5 [e980d9f0] shrink_slab at c0198e57 #6 [e980da3c] do_try_to_free_pages at c019a698 #7 [e980da74] try_to_free_pages at c019a867 #8 [e980dac4] __alloc_pages_internal at c0194112 #9 [e980db10] allocate_slab at c01b8040 #10 [e980db30] new_slab at c01b8122 #11 [e980db50] __slab_alloc at c01b8769 #12 [e980db70] kmem_cache_alloc at c01b88dd #13 [e980db90] kmem_zone_alloc at f8ddf9e9 #14 [e980dbb4] kmem_zone_zalloc at f8ddfa38 #15 [e980dbc8] xfs_iformat at f8dbca9a #16 [e980dc1c] xfs_iread at f8dbda96 did xfs_itobp to get locked xfs_buf #17 [e980dc50] xfs_iget_core at f8dbba2b #18 [e980dca0] xfs_iget at f8dbbf66 #19 [e980dcd8] xfs_lookup at f8ddb9c1 #20 [e980dd14] xfs_vn_lookup at f8de726b #21 [e980dd34] __lookup_hash at c01c89f3 #22 [e980dd50] lookup_one_len at c01c8b07 Signed-off-by: Peter Watkins <treestem@xxxxxxxxx> --- fs/xfs/linux-2.6/xfs_buf.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/xfs/linux-2.6/xfs_buf.c b/fs/xfs/linux-2.6/xfs_buf.c index 8454dee..ba3a11b 100644 --- a/fs/xfs/linux-2.6/xfs_buf.c +++ b/fs/xfs/linux-2.6/xfs_buf.c @@ -539,7 +539,7 @@ found: * spinlock and do a hard attempt on the semaphore. */ if (down_trylock(&bp->b_sema)) { - if (!(flags & XBF_TRYLOCK)) { + if (!(flags & (XBF_TRYLOCK|XBF_DONT_BLOCK))) { /* wait for buffer ownership */ XB_TRACE(bp, "get_lock", 0); xfs_buf_lock(bp); -- 1.6.0.4 _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs