[BUG] xfs deadlock due to agi and agf

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi list

We are running out online tasks on centos 4.18.0-147.5.1.el8 kernel,
and we have encountered a deadlock case many times. And it looks like
a deadlock around agf and agi. 

We get a scene which has 106 backtrace of D tasks and there are mainly
6 kinds of backtrace. Most of them were waiting for AGI's lock and others
are AGF.

Since the correct order is AGI->AGF, so the backtrace trying to require
AGI's lock is more suspect, right ?

Thanks a million for any help.

AGI
================================================
[<0>] down+0x3b/0x50
[<0>] xfs_buf_lock+0x33/0x100 [xfs]
[<0>] xfs_buf_find.isra.27+0x3fa/0x630 [xfs]
[<0>] xfs_buf_get_map+0x40/0x2a0 [xfs]
[<0>] xfs_buf_read_map+0x28/0x1b0 [xfs]
[<0>] xfs_trans_read_buf_map+0xc6/0x340 [xfs]
[<0>] xfs_read_agi+0x95/0x140 [xfs]
[<0>] xfs_iunlink+0x4d/0x140 [xfs]
[<0>] xfs_remove+0x1e2/0x2d0 [xfs]
[<0>] xfs_vn_unlink+0x55/0xa0 [xfs]
[<0>] vfs_unlink+0x109/0x1a0
[<0>] do_unlinkat+0x225/0x310
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
// 14/106
// xfs_droplink(), namely, add the inode to orphan list of agi
// can only be one agi, no agf, so it shouldn't be the initiator, right ?

[<0>] down+0x3b/0x50
[<0>] xfs_buf_lock+0x33/0x100 [xfs]
[<0>] xfs_buf_find.isra.27+0x3fa/0x630 [xfs]
[<0>] xfs_buf_get_map+0x40/0x2a0 [xfs]
[<0>] xfs_buf_read_map+0x28/0x1b0 [xfs]
[<0>] xfs_trans_read_buf_map+0xc6/0x340 [xfs]
[<0>] xfs_read_agi+0x95/0x140 [xfs]
[<0>] xfs_iunlink_remove+0x5b/0x430 [xfs]
[<0>] xfs_ifree+0x4a/0x160 [xfs]
[<0>] xfs_inactive_ifree+0xa1/0x1b0 [xfs]
[<0>] xfs_inactive+0x9e/0x140 [xfs]
[<0>] xfs_fs_destroy_inode+0xa8/0x1c0 [xfs]
[<0>] do_unlinkat+0x256/0x310
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
// 6/106
// remove the inode from orphan list of agi
// this is after truncate which need to lock agf, but in differernt transaction
// so it's also safe...

[<0>] down+0x3b/0x50
[<0>] xfs_buf_lock+0x33/0x100 [xfs]
[<0>] xfs_buf_find.isra.27+0x3fa/0x630 [xfs]
[<0>] xfs_buf_get_map+0x40/0x2a0 [xfs]
[<0>] xfs_buf_read_map+0x28/0x1b0 [xfs]
[<0>] xfs_trans_read_buf_map+0xc6/0x340 [xfs]
[<0>] xfs_read_agi+0x95/0x140 [xfs]
[<0>] xfs_ialloc_read_agi+0x2f/0xd0 [xfs]
[<0>] xfs_dialloc+0xe7/0x2b0 [xfs]
[<0>] xfs_ialloc+0x6b/0x5a0 [xfs]
[<0>] xfs_dir_ialloc+0x6c/0x220 [xfs]
[<0>] xfs_create+0x227/0x5d0 [xfs]
[<0>] xfs_generic_create+0x237/0x2e0 [xfs]
[<0>] vfs_mkdir+0x102/0x1b0
[<0>] do_mkdirat+0x7d/0xf0
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
// 73/106
// Can it try to get agi when holding agf ? especially, when
// xfs_ialloc_ag_alloc() doesn't get any blocks but left AGF's
// locked buf in transaction


AGF
=============================================================
[<0>] down+0x3b/0x50
[<0>] xfs_buf_lock+0x33/0x100 [xfs]
[<0>] xfs_buf_find.isra.27+0x3fa/0x630 [xfs]
[<0>] xfs_buf_get_map+0x40/0x2a0 [xfs]
[<0>] xfs_buf_read_map+0x28/0x1b0 [xfs]
[<0>] xfs_trans_read_buf_map+0xc6/0x340 [xfs]
[<0>] xfs_read_agf+0x8e/0x120 [xfs]
[<0>] xfs_alloc_read_agf+0x3e/0x1b0 [xfs]
[<0>] xfs_alloc_fix_freelist+0x3bc/0x470 [xfs]
[<0>] xfs_free_extent_fix_freelist+0x64/0xb0 [xfs]
[<0>] __xfs_free_extent+0x5e/0x150 [xfs]
[<0>] xfs_trans_free_extent+0x42/0x120 [xfs]
[<0>] xfs_extent_free_finish_item+0x26/0x40 [xfs]
[<0>] xfs_defer_finish_noroll+0x180/0x4d0 [xfs]
[<0>] xfs_defer_finish+0x13/0x70 [xfs]
[<0>] xfs_itruncate_extents_flags+0xde/0x250 [xfs]
[<0>] xfs_inactive_truncate+0xa3/0xf0 [xfs]
[<0>] xfs_inactive+0xb5/0x140 [xfs]
[<0>] xfs_fs_destroy_inode+0xa8/0x1c0 [xfs]
[<0>] do_unlinkat+0x256/0x310
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
// Just truncate the data extent, there has been multi-AGF order handling in it
// order of agf ? 'xfs: fix multi-AG deadlock in xfs_bunmapi' has been introduced

[<0>] down+0x3b/0x50
[<0>] xfs_buf_lock+0x33/0x100 [xfs]
[<0>] xfs_buf_find.isra.27+0x3fa/0x630 [xfs]
[<0>] xfs_buf_get_map+0x40/0x2a0 [xfs]
[<0>] xfs_buf_read_map+0x28/0x1b0 [xfs]
[<0>] xfs_trans_read_buf_map+0xc6/0x340 [xfs]
[<0>] xfs_read_agf+0x8e/0x120 [xfs]
[<0>] xfs_alloc_read_agf+0x3e/0x1b0 [xfs]
[<0>] xfs_alloc_fix_freelist+0x3bc/0x470 [xfs]
[<0>] xfs_free_extent_fix_freelist+0x64/0xb0 [xfs]
[<0>] __xfs_free_extent+0x5e/0x150 [xfs]
[<0>] xfs_trans_free_extent+0x42/0x120 [xfs]
[<0>] xfs_extent_free_finish_item+0x26/0x40 [xfs]
[<0>] xfs_defer_finish_noroll+0x180/0x4d0 [xfs]
[<0>] __xfs_trans_commit+0x149/0x350 [xfs]
[<0>] xfs_remove+0x21d/0x2d0 [xfs]
[<0>] xfs_vn_unlink+0x55/0xa0 [xfs]
[<0>] vfs_rmdir+0x7a/0x140
[<0>] do_rmdir+0x17d/0x1e0
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
//It seems that the defer operation comes from xfs_dir_removename()
//It should hold agi

[<0>] down+0x3b/0x50
[<0>] xfs_buf_lock+0x33/0x100 [xfs]
[<0>] xfs_buf_find.isra.27+0x3fa/0x630 [xfs]
[<0>] xfs_buf_get_map+0x40/0x2a0 [xfs]
[<0>] xfs_buf_read_map+0x28/0x1b0 [xfs]
[<0>] xfs_trans_read_buf_map+0xc6/0x340 [xfs]
[<0>] xfs_read_agf+0x8e/0x120 [xfs]
[<0>] xfs_alloc_read_agf+0x3e/0x1b0 [xfs]
[<0>] xfs_alloc_fix_freelist+0x3bc/0x470 [xfs]
[<0>] xfs_alloc_vextent+0x27b/0x560 [xfs]
[<0>] __xfs_inobt_alloc_block.isra.8+0xc0/0x120 [xfs]
[<0>] __xfs_btree_split.isra.38+0xdd/0x580 [xfs]
[<0>] xfs_btree_split+0x4b/0x100 [xfs]
[<0>] xfs_btree_make_block_unfull+0x192/0x1c0 [xfs]
[<0>] xfs_btree_insrec+0x41d/0x4c0 [xfs]
[<0>] xfs_btree_insert+0xcb/0x230 [xfs]
[<0>] xfs_difree_finobt+0xb9/0x1e0 [xfs]
[<0>] xfs_difree+0x11e/0x190 [xfs]
[<0>] xfs_ifree+0x82/0x160 [xfs]
[<0>] xfs_inactive_ifree+0xa1/0x1b0 [xfs]
[<0>] xfs_inactive+0x9e/0x140 [xfs]
[<0>] xfs_fs_destroy_inode+0xa8/0x1c0 [xfs]
[<0>] do_unlinkat+0x256/0x310
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
// Hold agi and operate agf


Thanks
Jianchao



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux