A file collapse stress test workload reproduces collapse failures
mid-operation due to changes in the inode fork extent count across
extent shift cycles.

xfs_collapse_file_space() currently calls xfs_bmap_shift_extents() to
shift one extent at a time per transaction. The extent index is used to
track the next extent to shift after each iteration. A concurrent fsx
and fsstress workload reproduces a scenario where the extent count
changes during this sequence, causing the 'current_ext' index to become
inaccurate and possibly skip shifting an extent. The likely result of
this behavior is that the subsequent shift attempt will not find a hole
in the area of the skipped extent and will fail, leaving the file in a
partially collapsed state.

This occurs because the ilock is released and reacquired across each
transaction, and therefore across each individual extent shift.
Tracepoint output shows that once the ilock is released after an extent
shift, a pending blocking writeback (e.g., sync) can acquire the lock
and proceed before the next extent is shifted down. If the writeback
converts part of a delayed allocation earlier in the file, for example,
it can insert a new extent into the map. Tracing confirms a call to
xfs_bmap_add_extent_delay_real() in this particular instance.

To prevent this scenario, hold the ilock across the entire extent shift
loop in xfs_collapse_file_space().
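
The stale-index problem described above can be illustrated with a small
userspace toy model. This is only a sketch under stated assumptions: none
of the toy_* names exist in XFS, the "map" is a plain array, and the
interleaved writeback is simulated by inserting one entry at the front of
the array after the first shift. In this simplified model the insertion
makes the saved index land on an entry that was already shifted; the
skipped extent described above depends on details of the real in-core
extent list that the sketch does not attempt to model. The common point
is that an index saved across a lock drop no longer identifies the next
extent to shift.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_EXTENTS 16

/* Hypothetical stand-in for an extent record; not an XFS structure. */
struct toy_extent {
	long orig_start;	/* offset before the collapse began */
	long start;		/* current offset */
	int  shifted;		/* how many times this entry was shifted */
};

struct toy_map {
	struct toy_extent ext[TOY_MAX_EXTENTS];
	size_t count;
};

/* Build a map with extents starting at offsets 10, 20, 30 and 40. */
struct toy_map toy_sample_map(void)
{
	struct toy_map m = { .count = 4 };
	for (size_t i = 0; i < m.count; i++) {
		m.ext[i].orig_start = m.ext[i].start = 10 * (long)(i + 1);
		m.ext[i].shifted = 0;
	}
	return m;
}

/* Insert an entry at idx, moving every later entry up by one index. */
void toy_insert(struct toy_map *m, size_t idx, struct toy_extent e)
{
	assert(m->count < TOY_MAX_EXTENTS && idx <= m->count);
	memmove(&m->ext[idx + 1], &m->ext[idx],
		(m->count - idx) * sizeof(m->ext[0]));
	m->ext[idx] = e;
	m->count++;
}

/*
 * Shift one entry per iteration, tracking the next entry by index.
 * If drop_lock is nonzero, simulate the lock being released after the
 * first shift: another thread inserts an entry at the front of the map.
 */
void toy_collapse(struct toy_map *m, size_t current_ext, long shift,
		  int drop_lock)
{
	int first = 1;

	while (current_ext < m->count) {
		m->ext[current_ext].start -= shift;
		m->ext[current_ext].shifted++;
		current_ext++;

		if (drop_lock && first) {
			struct toy_extent e = { 0, 0, 0 };
			toy_insert(m, 0, e);	/* index is now stale */
			first = 0;
		}
	}
}

/* Count entries at or after from_off that were not shifted exactly once,
 * plus entries before from_off that were shifted at all. */
size_t toy_misshifted(const struct toy_map *m, long from_off)
{
	size_t bad = 0;

	for (size_t i = 0; i < m->count; i++) {
		int want = m->ext[i].orig_start >= from_off ? 1 : 0;
		if (m->ext[i].shifted != want)
			bad++;
	}
	return bad;
}
```

Calling toy_collapse() on a fresh toy_sample_map() with drop_lock set to 0
(the analogue of holding the ilock across the loop) shifts every entry
from the starting index exactly once. With drop_lock set to 1, the entry
that originally started at offset 20 is shifted twice, and
toy_misshifted() reports it.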

Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
---
 fs/xfs/xfs_bmap_util.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 2f1e30d..96eb97b 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1474,6 +1474,8 @@ xfs_collapse_file_space(
 	if (error)
 		return error;
 
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+
 	while (!error && !done) {
 		tp = xfs_trans_alloc(mp, XFS_TRANS_DIOSTRAT);
 		/*
@@ -1489,7 +1491,6 @@ xfs_collapse_file_space(
 			break;
 		}
 
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		error = xfs_trans_reserve_quota(tp, mp, ip->i_udquot,
 				ip->i_gdquot, ip->i_pdquot,
 				XFS_DIOSTRAT_SPACE_RES(mp, 0), 0,
@@ -1517,9 +1518,9 @@ xfs_collapse_file_space(
 			goto out;
 
 		error = xfs_trans_commit(tp, XFS_TRANS_RELEASE_LOG_RES);
-		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	}
 
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 
 out:
-- 
1.8.3.1