On Thu, Mar 24, 2022 at 08:11:47AM +1100, Dave Chinner wrote: > On Sun, Mar 20, 2022 at 09:43:49AM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > Don't spin in an infinite loop trying to reserve blocks -- if we can't > > do it after 30 tries, we're racing with a nearly full filesystem, so > > just give up. > > > > Cc: Brian Foster <bfoster@xxxxxxxxxx> > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > > --- > > fs/xfs/xfs_fsops.c | 12 ++++++++++-- > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c > > index 710e857bb825..615334e4f689 100644 > > --- a/fs/xfs/xfs_fsops.c > > +++ b/fs/xfs/xfs_fsops.c > > @@ -379,6 +379,7 @@ xfs_reserve_blocks( > > int64_t fdblks_delta = 0; > > uint64_t request; > > int64_t free; > > + unsigned int tries; > > int error = 0; > > > > /* If inval is null, report current values and return */ > > @@ -430,9 +431,16 @@ xfs_reserve_blocks( > > * If the request is larger than the current reservation, reserve the > > * blocks before we update the reserve counters. Sample m_fdblocks and > > * perform a partial reservation if the request exceeds free space. > > + * > > + * The loop body estimates how many blocks it can request from fdblocks > > + * to stash in the reserve pool. This is a classic TOCTOU race since > > + * fdblocks updates are not always coordinated via m_sb_lock. We also > > + * cannot tell if @free remaining unchanged between iterations is due > > + * to an idle system or freed blocks being consumed immediately, so > > + * we'll try a finite number of times to satisfy the request. > > */ > > error = -ENOSPC; > > - do { > > + for (tries = 0; tries < 30 && error == -ENOSPC; tries++) { > > free = percpu_counter_sum(&mp->m_fdblocks) - > > xfs_fdblocks_unavailable(mp); > > if (free <= 0) > > @@ -459,7 +467,7 @@ xfs_reserve_blocks( > > spin_unlock(&mp->m_sb_lock); > > error = xfs_mod_fdblocks(mp, -fdblks_delta, 0); > > spin_lock(&mp->m_sb_lock); > > - } while (error == -ENOSPC); > > + } > > So the problem here is that if we don't get all of the reservation, > we get none of it, so we try again. This seems like we should simply > punch the error back out to userspace and let it retry rather than > try a few times and then fail anyway. Either way, userspace has to > handle failure to reserve enough blocks. > > The other alternative here is that we just change the reserve block > limit when ENOSPC occurs and allow the xfs_mod_fdblocks() refill > code just fill up the pool as space is freed. That would greatly > simplify the code - we do a single attempt to reserve the new space, > and if it fails we don't care because the reserve space gets > refilled before free space is made available again. I think that's > preferable to having an arbitrary number of retries like this that's > going to end up failing anyway. How about I try the following tomorrow? spin_lock(...); mp->m_resblks = request; free = percpu_counter_sum(&mp->m_fdblocks) - xfs_fdblocks_unavailable(mp); if (request < mp->m_resblks_avail && free > 0) { ask = min(request - mp->m_resblks_avail, free); spin_unlock(...); error = xfs_mod_fdblocks(mp, -ask, 0); spin_lock(...); if (!error) mp->m_resblks_avail += ask; } spin_unlock(...); return error; --D > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx