[PATCH 00/17 V3] XFS: LARP state machine and recovery rework

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Oops, I screwed up the the subject line. Let's fix that up so
that archive searches can find it easily.

On Fri, May 06, 2022 at 07:45:36PM +1000, Dave Chinner wrote:
> [PATCH 00/17 V3] XFS: LARP state machine and recovery rework
> 
> This patchset aims to simplify the state machine that the new logged
> attributes require to do their stuff. It also reworks the
> attribute logging and recovery algorithms to simplify them and to
> avoid leaving incomplete attributes around after recovery.
> 
> When I first dug into this code, I modified the state machine
> to try to simplify the set and replace operations as there was
> a lot of duplicate code in them. I also wasn't completely happy with
> the different transistions for replace operations when larp mode was
> enabled. I simplified the state machien and renumbered the states so
> that we can iterate through the same state progression for different
> attr formats just by having different inital leaf and node states.
> 
> Once I'd largely done this and started testing it, I realised that
> recovery wasn't setting the initial state properly, so I started
> digging into the recovery code. At this point, I realised the
> recovery code didn't work correctly in all cases and could often
> leave unremovable incomplete attrs sitting around. The issues are
> larger documented in the last patch in the series, so I won't go
> over them here, just read that patch.
> 
> However, to get, the whole replace operation for LARP=1 needed to
> change. Luckily, that turned out to be pretty simple because it was
> largely already broken down into component operations in the state
> machine. hence I just needed to add new "remove" initial states to
> the set_iter state machine, and that allowed the new algorithm to
> function.
> 
> Then I realised that I'd just implemented the remove_iter algorithm
> in the set_iter state machine, so I removed the remove_iter code and
> it's states altogether and just pointed remove ops at the set-iter
> remove initial states. The code now uses the XFS_DA_OP_RENAME flag
> to determine if it should follow up an add or remove with a remove
> or add, and it all largely just works. All runtime algorithms run
> throught he same state machine just with different initial states
> and state progressions.
> 
> And with the last patch in the series, attr item log recovery uses
> that same state machine, too. It has a few quirks that need to be
> handled, so I added the XFS_DA_OP_RECOVERY flag to allow the right
> thing to be done with the INCOMPLETE flag deep in the guts of the
> attr lookup code. And so recovery with LARP=1 now seems to mostly
> work.
> 
> This version passes fstests, several recoveryloop passes and the
> targeted error injection based attr recovery test that Catherine
> wrote. There's a fair bit of re-org in the patch series since V2,
> but most of that is pulling stuff from the last patch and putting it
> in the right place in the series. hence the only real logic iand bug
> fixes changes occurred in the last patch that changes the logging 
> and recovery algorithm.
> 
> Comments, reviews and, most especially, testing welcome.
> 
> A compose of the patchset I've been testing based on the current
> for-next tree can be found here:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git xfs-5.19-compose
> 
> Version 3:
> - rebased on 5.18-rc2 + for-next + rebased larp v29
> - added state asserts to xfs_attr_dela_state_set_replace()
> - only roll the transactions in ALLOC_RMT when we need to do more
>   extent allocation for the remote attr value container.
> - removed unnecessary attr->xattri_blkcnt check in ALLOC_RMT
> - added comments to commit message to explain why we are combining
>   the set and remove paths.
> - preserved and isolated the state path save/restore code for
>   avoiding repeated name entry path lookups when rolling transactins
>   across remove operations.  Left a big comment for Future Dave to
>   re-enable the optimisation.
> - fixed a transient attr fork removal bug in
>   xfs_attr3_leaf_to_shortform() in the new removal algorithm which
>   can result in the attr fork being removed between the remove and
>   set ops in a REPLACE operation.
> - moved defer operation setup changes ("split replace from set op")
>   to early on in the series and combined it with the refactoring
>   done immediately afterwards in the last patch of the series. This
>   allows for cleanly fixing the log recovery state initialisation
>   problem the patchset had.
> - pulled the state initialisation for log recovery up into the
>   patches that introduce the state machine changes for the given
>   operations.
> - Lots of changes to log recovery algorithm change in last patch.
> 
> Version 2:
> https://lore.kernel.org/linux-xfs/20220414094434.2508781-1-david@xxxxxxxxxxxxx/
> - fixed attrd initialisation for LARP=1 mode
> - fixed REPLACE->REMOVE_OLD state transition for LARP=1 mode
> - added more comments to describe the assumptions that allow
>   xfs_attr_remove_leaf_attr() to work for both modes
> 
> 

-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux