From: Darrick J. Wong <djwong@xxxxxxxxxx> Commit 1f7c7553489c tried to reduce the memory requirements of phase 6 of repair by redesigning longform_dir2_entry_check without the bplist array. Unfortunately, none of us noticed that the code that rejects a dir block with a bad header now leaks the xfs_buf object because we no longer have a bplist to drop the buffer references. Any time we hold a buffer and decide to move on in the dabno loop, we must release the buffer. The immediate result of this error is that dir_binval complains about the recursive lock count of the buffer when we blow out the directory. However, if the block is reallocated by another thread, repair will deadlock when it tries to get the buffer and cannot take the buffer lock. Found via xfs/113 fuzzing data format directory blocks. For whatever reason this happens much more frequently when su=128k,sw=4, but this applies to everyone equally. While we're at it, make the relse at the bottom of the function run for any remaining buffer reference, even if this isn't a block format directory to avoid leaving a landmine in case we ever add a "goto fix" inside the loop for a non-block directory. Fixes: 1f7c7553489 ("repair: don't duplicate names in phase 6") Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> --- repair/phase6.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/repair/phase6.c b/repair/phase6.c index 0be2c9c9705..48bf57359c5 100644 --- a/repair/phase6.c +++ b/repair/phase6.c @@ -2332,6 +2332,9 @@ longform_dir2_entry_check( fixit++; if (isblock) goto out_fix; + + libxfs_buf_relse(bp); + bp = NULL; continue; } } @@ -2343,6 +2346,7 @@ longform_dir2_entry_check( break; libxfs_buf_relse(bp); + bp = NULL; } fixit |= (*num_illegal != 0) || dir2_is_badino(ino) || *need_dot; @@ -2370,7 +2374,7 @@ longform_dir2_entry_check( } } out_fix: - if (isblock && bp) + if (bp) libxfs_buf_relse(bp); if (!no_modify && (fixit || dotdot_update)) {