On Sun, Mar 20, 2022 at 09:43:49AM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > Don't spin in an infinite loop trying to reserve blocks -- if we can't > do it after 30 tries, we're racing with a nearly full filesystem, so > just give up. > > Cc: Brian Foster <bfoster@xxxxxxxxxx> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > --- > fs/xfs/xfs_fsops.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c > index 710e857bb825..615334e4f689 100644 > --- a/fs/xfs/xfs_fsops.c > +++ b/fs/xfs/xfs_fsops.c > @@ -379,6 +379,7 @@ xfs_reserve_blocks( > int64_t fdblks_delta = 0; > uint64_t request; > int64_t free; > + unsigned int tries; > int error = 0; > > /* If inval is null, report current values and return */ > @@ -430,9 +431,16 @@ xfs_reserve_blocks( > * If the request is larger than the current reservation, reserve the > * blocks before we update the reserve counters. Sample m_fdblocks and > * perform a partial reservation if the request exceeds free space. > + * > + * The loop body estimates how many blocks it can request from fdblocks > + * to stash in the reserve pool. This is a classic TOCTOU race since > + * fdblocks updates are not always coordinated via m_sb_lock. We also > + * cannot tell if @free remaining unchanged between iterations is due > + * to an idle system or freed blocks being consumed immediately, so > + * we'll try a finite number of times to satisfy the request. > */ > error = -ENOSPC; > - do { > + for (tries = 0; tries < 30 && error == -ENOSPC; tries++) { > free = percpu_counter_sum(&mp->m_fdblocks) - > xfs_fdblocks_unavailable(mp); > if (free <= 0) > @@ -459,7 +467,7 @@ xfs_reserve_blocks( > spin_unlock(&mp->m_sb_lock); > error = xfs_mod_fdblocks(mp, -fdblks_delta, 0); > spin_lock(&mp->m_sb_lock); > - } while (error == -ENOSPC); > + } So the problem here is that if we don't get all of the reservation, we get none of it, so we try again. This seems like we should simply punch the error back out to userspace and let it retry rather than try a few times and then fail anyway. Either way, userspace has to handle failure to reserve enough blocks. The other alternative here is that we just change the reserve block limit when ENOSPC occurs and allow the xfs_mod_fdblocks() refill code just fill up the pool as space is freed. That would greatly simplify the code - we do a single attempt to reserve the new space, and if it fails we don't care because the reserve space gets refilled before free space is made available again. I think that's preferable to having an arbitrary number of retries like this that's going to end up failing anyway. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx