[PATCH] xfs: fix AGFL allocation deadlock

Wengang Wang <wen.gang.wang@xxxxxxxxxx> · Thu, 30 Mar 2023 13:46:10 -0700

There is a deadlock; the call trace of process 10133 is:

PID 10133 not sceduled for 4403385ms (was on CPU[10])
	#0	context_switch() kernel/sched/core.c:3881
	#1	__schedule() kernel/sched/core.c:5111
	#2	schedule() kernel/sched/core.c:5186
	#3	xfs_extent_busy_flush() fs/xfs/xfs_extent_busy.c:598
	#4	xfs_alloc_ag_vextent_size() fs/xfs/libxfs/xfs_alloc.c:1641
	#5	xfs_alloc_ag_vextent() fs/xfs/libxfs/xfs_alloc.c:828
	#6	xfs_alloc_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:2362
	#7	xfs_free_extent_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:3029
	#8	__xfs_free_extent() fs/xfs/libxfs/xfs_alloc.c:3067
	#9	xfs_trans_free_extent() fs/xfs/xfs_extfree_item.c:370
	#10	xfs_efi_recover() fs/xfs/xfs_extfree_item.c:626
	#11	xlog_recover_process_efi() fs/xfs/xfs_log_recover.c:4605
	#12	xlog_recover_process_intents() fs/xfs/xfs_log_recover.c:4893
	#13	xlog_recover_finish() fs/xfs/xfs_log_recover.c:5824
	#14	xfs_log_mount_finish() fs/xfs/xfs_log.c:764
	#15	xfs_mountfs() fs/xfs/xfs_mount.c:978
	#16	xfs_fs_fill_super() fs/xfs/xfs_super.c:1908
	#17	mount_bdev() fs/super.c:1417
	#18	xfs_fs_mount() fs/xfs/xfs_super.c:1985
	#19	legacy_get_tree() fs/fs_context.c:647
	#20	vfs_get_tree() fs/super.c:1547
	#21	do_new_mount() fs/namespace.c:2843
	#22	do_mount() fs/namespace.c:3163
	#23	ksys_mount() fs/namespace.c:3372
	#24	__do_sys_mount() fs/namespace.c:3386
	#25	__se_sys_mount() fs/namespace.c:3383
	#26	__x64_sys_mount() fs/namespace.c:3383
	#27	do_syscall_64() arch/x86/entry/common.c:296
	#28	entry_SYSCALL_64() arch/x86/entry/entry_64.S:180

It is waiting for xfs_perag.pagb_gen to increase, which happens when a busy
extent is cleared. From the vmcore, it is waiting on AG 1, and the ONLY busy
extent for AG 1 belongs to the transaction (in xfs_trans.t_busy) of process
10133. That busy extent was created by a previous EFI in the same transaction.
Since process 10133 is waiting, it has no chance to commit that transaction,
so the busy extent can't be cleared and pagb_gen remains unchanged. Hence the
deadlock.
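
For illustration only, the self-wait condition can be modelled in userspace
roughly as below. This is a simplified sketch, not kernel code: struct
busy_extent, owner_tid and busy_cleared_by_others() are hypothetical names
made up for this example, and the real busy extent tracking is more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not the kernel structures. */
struct busy_extent {
	int	owner_tid;	/* transaction that created the busy extent */
};

/*
 * Return true if at least one busy extent in the AG is owned by a
 * transaction other than the waiter. Only such extents can be cleared while
 * the waiter sleeps, because a sleeping transaction cannot commit and
 * release its own busy extents. Returns false for an empty list as well.
 */
static bool
busy_cleared_by_others(
	const struct busy_extent	*busy,
	size_t				nr_busy,
	int				waiter_tid)
{
	size_t				i;

	for (i = 0; i < nr_busy; i++) {
		if (busy[i].owner_tid != waiter_tid)
			return true;
	}
	return false;
}
```

In the case above, the only busy extent in AG 1 is owned by the waiter
itself, so this check would return false and the wait never ends.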

commit 06058bc40534530e617e5623775c53bb24f032cb disallowed using busy extents
for any path that calls xfs_extent_busy_trim(). That looks like overkill.
AGFL block allocation just uses the first extent that satisfies the request;
it won't try another extent to choose a "better" one. So it's safe to reuse
busy extents for AGFL.

To fix the above deadlock, this patch allows reusing busy extents for AGFL
allocation.

Signed-off-by: Wengang Wang <wen.gang.wang@xxxxxxxxxx>
---
 fs/xfs/xfs_extent_busy.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
index ef17c1f6db32..f857a5759506 100644
--- a/fs/xfs/xfs_extent_busy.c
+++ b/fs/xfs/xfs_extent_busy.c
@@ -344,6 +344,7 @@ xfs_extent_busy_trim(
 	ASSERT(*len > 0);
 
 	spin_lock(&args->pag->pagb_lock);
+restart:
 	fbno = *bno;
 	flen = *len;
 	rbp = args->pag->pagb_tree.rb_node;
@@ -362,6 +363,20 @@ xfs_extent_busy_trim(
 			continue;
 		}
 
+		/*
+		 * AGFL (metadata) allocation just takes the first extent
+		 * that fits; there is no optimization that tries further
+		 * extents. So it is safe to reuse the busy extent and to
+		 * update it here.
+		 * Reuse it for AGFL even if the extent is being discarded.
+		 */
+		if (args->resv == XFS_AG_RESV_AGFL) {
+			if (!xfs_extent_busy_update_extent(args->mp, args->pag,
+				busyp, fbno, flen, false))
+				goto restart;
+			continue;
+		}
+
 		if (bbno <= fbno) {
 			/* start overlap */
 
-- 
2.21.0 (Apple Git-122.2)