Re: [PATCH 25/43] xfs: add support for zoned space reservations

On Fri, Dec 13, 2024 at 01:01:40PM -0800, Darrick J. Wong wrote:
> > +#define XFS_ZR_GREEDY		(1U << 0)
> > +#define XFS_ZR_NOWAIT		(1U << 1)
> > +#define XFS_ZR_RESERVED		(1U << 2)
> 
> What do these flag values mean?  Can we put that into comments?

Sure.

> > + * For XC_FREE_RTAVAILABLE only the smaller reservation required for GC and
> > + * block zeroing is excluded from the user capacity, while XC_FREE_RTEXTENTS
> > + * is further restricted by at least one zone as well as the optional
> > + * persistently reserved blocks.  This allows the allocator to run more
> > + * smoothly by not always triggering GC.
> 
> Hmm, so _RTAVAILABLE really means _RTNOGC?  That makes sense.

Yes, it means blocks that are available without doing further work.
I can't say _RTNOGC is very descriptive either, but I would not mind
a better name if someone comes up with a good one :)

> > +		spin_unlock(&zi->zi_reservation_lock);
> > +		schedule();
> > +		spin_lock(&zi->zi_reservation_lock);
> > +	}
> > +	list_del(&reservation.entry);
> > +	spin_unlock(&zi->zi_reservation_lock);
> 
> Hmm.  So if I'm understanding correctly, threads wanting to write to a
> file try to locklessly reserve space from RTAVAILABLE.

At least if there are no waiters yet, yes.

> If they can't
> get space because the zone is nearly full / needs gc / etc then everyone
> gets to wait FIFO style in the reclaim_reservations list.

Yes (in a way modelled after the log grant waits).

> They can be
> woken up from the wait if either (a) someone gives back reserved space
> or (b) the copygc empties out this zone.
> 
> Or if the thread isn't willing to wait, we skip the fifo and either fail
> up to userspace

Yes.

> or just move on to the next zone?

No other zone to move to.
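
To make the waiting rules discussed above concrete, here is a rough
single-threaded userspace sketch.  It is not the patch code: all names
(zone_pool, pool_try_reserve, pool_give_back, MAX_WAITERS) are made up
for illustration, a plain counter stands in for the real free-space
accounting, and the sleeping and locking are replaced by a "granted"
flag.  It only models the ordering: the fast path is taken when nobody
is queued, waiters are served strictly FIFO, and a caller that is not
willing to wait fails instead of queueing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WAITERS 16	/* hypothetical bound, for the sketch only */

enum res_status { RES_GRANTED, RES_QUEUED, RES_FAILED };

struct waiter {
	uint64_t	wanted;		/* blocks this caller asked for */
	bool		granted;	/* set once the space was handed over */
};

struct zone_pool {
	uint64_t	available;	/* blocks usable without more work */
	struct waiter	*queue[MAX_WAITERS];	/* FIFO ring of waiters */
	size_t		head, count;
};

/*
 * Take the fast path only if nobody is queued, so that a late arrival
 * cannot overtake earlier waiters.  Otherwise queue up, or fail right
 * away if the caller is not willing to wait.
 */
static enum res_status pool_try_reserve(struct zone_pool *p,
		struct waiter *w, bool nowait)
{
	w->granted = false;
	if (p->count == 0 && p->available >= w->wanted) {
		p->available -= w->wanted;
		w->granted = true;
		return RES_GRANTED;
	}
	if (nowait)
		return RES_FAILED;
	assert(p->count < MAX_WAITERS);
	p->queue[(p->head + p->count) % MAX_WAITERS] = w;
	p->count++;
	return RES_QUEUED;
}

/*
 * Called when space comes back (a reservation is returned or GC frees
 * blocks).  Serve waiters from the head and stop at the first one that
 * does not fit yet, even if a later, smaller request would.
 */
static void pool_give_back(struct zone_pool *p, uint64_t blocks)
{
	p->available += blocks;
	while (p->count > 0) {
		struct waiter *w = p->queue[p->head];

		if (p->available < w->wanted)
			break;
		p->available -= w->wanted;
		w->granted = true;
		p->head = (p->head + 1) % MAX_WAITERS;
		p->count--;
	}
}
```

Stopping at the first waiter that does not fit is what keeps a large
request from being starved by a stream of small ones, at the cost of
small requests sometimes waiting behind it.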

> I think I understand the general idea, but I don't quite know when we're
> going to use the greedy algorithm.  Later I see XFS_ZR_GREEDY gets used
> from the buffered write path, but there doesn't seem to be an obvious
> reason why?

POSIX/Linux semantics for buffered writes require us to implement
short writes.  That is, if a single (p)write(v) syscall for say 10MB
only finds 512k of space, it should write that much instead of failing
with ENOSPC.  XFS_ZR_GREEDY implements that by backing down to
what we can allocate (the current implementation of that is
a little ugly; I plan to find some time for changes to the core
percpu_counters to improve this after the code is merged).
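
As a rough illustration of the backing-down behaviour, here is a
minimal userspace sketch.  It is not the implementation in the patch:
space_pool, reserve_blocks and ZR_GREEDY are hypothetical names, and a
plain counter replaces the percpu_counter.  The function returns how
many blocks were actually reserved, and 0 is what the caller would
turn into ENOSPC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZR_GREEDY	(1U << 0)	/* hypothetical stand-in for XFS_ZR_GREEDY */

struct space_pool {
	uint64_t	available;	/* free blocks left in the pool */
};

/*
 * Reserve up to @wanted blocks.  Without ZR_GREEDY the reservation is
 * all or nothing.  With ZR_GREEDY a request that does not fit is
 * shrunk to whatever is left, which is what lets the caller do a
 * short write.
 */
static uint64_t reserve_blocks(struct space_pool *pool, uint64_t wanted,
		unsigned int flags)
{
	uint64_t got;

	assert(pool != NULL);

	if (pool->available >= wanted) {
		pool->available -= wanted;
		return wanted;
	}
	if (!(flags & ZR_GREEDY) || pool->available == 0)
		return 0;

	/* Back down to what is there and take all of it. */
	got = pool->available;
	pool->available = 0;
	return got;
}
```

Assuming 4k blocks, the 10MB write from the example above asks for
2560 blocks.  With only 128 blocks (512k) left, the greedy call
returns 128 and the write comes back short, while the same call
without the flag returns 0.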




