Re: [PATCH v3 00/42] xfs: per-ag centric allocation algorithms

On Thu, Feb 09, 2023 at 07:09:56PM -0800, Darrick J. Wong wrote:
> On Fri, Feb 10, 2023 at 09:17:43AM +1100, Dave Chinner wrote:
> > This series continues the work towards making shrinking a filesystem possible.
> > We need to be able to stop operations from taking place on AGs that need to be
> > removed by a shrink, so before shrink can be implemented we need to have the
> > infrastructure in place to prevent incursion into AGs that are going to be, or
> > are in the process of being, removed from active duty.
> 
> From a quick glance, it looks like all the random things I had comments
> about were fixed, so for patches 14, 20-23, 28, and 42:
> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> I'm leaving off patch #7 until tomorrow so that I can think about it
> with a non-tired brain.  I didn't see anything obviously wrong in the
> diff itself -- but I still need to adjust my mental model per what Dave
> said in his previous reply (active perag refs are for user-facing
> online operations, passive refs are for internal operations) and
> (re)examine how that relates to scrub and repair.
> 
> Mostly I tripped over "but repair needs to use passive references once
> the AG has had its state changed to 'offline'" -- currently, repair uses
> the same perag reference that scrub gets.  If scrub now gets an
> "active" reference and something needs repair, do we mark the AG offline
> and keep the active reference?  Or downgrade it to a passive reference?

So one of the things I was trying to explain and didn't do a very
good job of is that active references are references that prevent AG
operational state changes.

That is, if we want to take an AG offline, or just prevent new
allocations in an AG, we have to wait for all the active references
to drain before we can change the operational state.

This does not prevent an active reference from being taken when an
AG is offline - all an active reference in an offline state does is
prevent the AG from being put back online whilst that active
reference to the offline AG persists.

e.g. repair can drain all the active references on an online AG,
then mark it offline, then take a new active reference to pin the AG
in the offline state while it does the repair work on that AG.
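
As a rough illustration of the drain/change/pin sequence, here is a
single-threaded toy model. Everything in it (struct toy_perag, the
toy_* helpers, the two-state enum) is a hypothetical stand-in and not
the actual perag code, and where real code would wait for the
references to drain, this sketch simply reports failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operational states; not the kernel's definitions. */
enum toy_opstate { TOY_AG_ONLINE, TOY_AG_OFFLINE };

struct toy_perag {
	int			active_refs;	/* references pinning the opstate */
	enum toy_opstate	opstate;
};

/* Taking an active reference is allowed in any opstate. */
static void toy_grab(struct toy_perag *pag)
{
	pag->active_refs++;
}

static void toy_rele(struct toy_perag *pag)
{
	assert(pag->active_refs > 0);
	pag->active_refs--;
}

/*
 * The opstate may only change once every active reference has drained.
 * Real code would wait here; the toy just refuses.
 */
static bool toy_set_opstate(struct toy_perag *pag, enum toy_opstate new_state)
{
	if (pag->active_refs != 0)
		return false;
	pag->opstate = new_state;
	return true;
}
```

In this model, repair would call toy_set_opstate(pag, TOY_AG_OFFLINE)
once the count reaches zero and then toy_grab(pag), after which any
attempt to put the AG back online fails until that reference is
released.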

Similarly, shrink can pin an AG in a "being shrunk" operational
state, one that allows inodes and extents to be freed but no new
allocations to be made. It does so by draining the active references,
changing the state and then taking a new active reference. Then the
shrink process can move all the user data and metadata out of the AG
without needing special tricks to avoid allocating in that AG. If it
fails to move everything or is aborted, it can drop its active
reference and put the AG back online...

This means that the new allocation code that now takes active
references will be morphing further to be "grab active reference,
check AG opstate allows allocation, if not drop active reference and
skip AG".
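
A minimal sketch of that loop shape, assuming a flat array of AGs and
made-up names throughout (struct sk_perag, sk_grab(), sk_rele(),
sk_pick_ag(), and a bool standing in for "opstate allows allocation").
In this toy, grabbing a reference always succeeds; it is only meant to
show the grab/check/drop/skip ordering, not the real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_perag {
	int	active_refs;
	bool	alloc_allowed;	/* stand-in for "opstate allows allocation" */
};

/* Hypothetical helper: take an active reference (never fails here). */
static bool sk_grab(struct sk_perag *pag)
{
	pag->active_refs++;
	return true;
}

/* Hypothetical helper: drop an active reference. */
static void sk_rele(struct sk_perag *pag)
{
	assert(pag->active_refs > 0);
	pag->active_refs--;
}

/*
 * Return the index of the first AG that permits allocation, with an
 * active reference still held on it, or -1 if every AG was skipped.
 */
static int sk_pick_ag(struct sk_perag *ags, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (!sk_grab(&ags[i]))
			continue;		/* could not pin this AG */
		if (!ags[i].alloc_allowed) {
			sk_rele(&ags[i]);	/* opstate forbids allocation */
			continue;
		}
		return (int)i;			/* caller owns the reference */
	}
	return -1;
}
```

The caller is expected to drop the reference on the returned AG once
the allocation attempt is finished.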

Once an AG is ready to be removed, grabbing an active reference
will fail so at that point the AG is skipped without even getting to
state checks. Once all the passive references then drain, the perag
can be RCU freed.
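
Sketched as a toy model, with the caveat that all of the names here
(struct rm_perag, rm_grab(), rm_put(), the "removing" and "freed"
flags) are assumptions for illustration, and that a plain flag stands
in for the RCU free:

```c
#include <assert.h>
#include <stdbool.h>

struct rm_perag {
	int	active_refs;
	int	passive_refs;
	bool	removing;	/* set once the AG is ready to be removed */
	bool	freed;		/* stand-in for the RCU free */
};

/* Taking an active reference fails once removal has begun. */
static bool rm_grab(struct rm_perag *pag)
{
	if (pag->removing)
		return false;
	pag->active_refs++;
	return true;
}

/* Dropping the last passive reference of a removing AG frees it. */
static void rm_put(struct rm_perag *pag)
{
	assert(pag->passive_refs > 0);
	pag->passive_refs--;
	if (pag->removing && pag->passive_refs == 0)
		pag->freed = true;
}
```

A caller that sees rm_grab() return false skips the AG without looking
at any other state.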

> I've (tried to) design scrub & repair as if they were just another pile
> of higher level code that uses libxfs to manipulate metadata, just like
> fallocate and reflink and all those types of things.  But then, I was
> designing for a world where there's only one type of AG reference. :)

For the moment, everything in scrub/repair will work just fine with
passive references. It's only once we start using AG opstate to
control allocator, caching and scanning behaviour that they might
want to start using active references...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


