Re: [RFC DELUGE v9r2d1] xfs: Parent Pointers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2023-02-16 at 12:06 -0800, Darrick J. Wong wrote:
> Hi everyone,
> 
> This deluge contains all of the additions to the parent pointers
> patchset that I've been working on for the past month.  The kernel
> and
> xfsprogs patchsets are based on Allison's v9r2 tag from last week;
> the fstests patches are merely a part of my development tree.  To
> recap
> Allison's cover letter:
> 
> "The goal of this patch set is to add a parent pointer attribute to
> each
> inode.  The attribute name containing the parent inode, generation,
> and
> directory offset, while the  attribute value contains the file name.
> This feature will enable future optimizations for online scrub,
> shrink,
> nfs handles, verity, or any other feature that could make use of
> quickly
> deriving an inodes path from the mount point."
> 
> The kernel branches start with a number of buf fixes that I need to
> get
> fstests to pass.  I also restructured the kernel implementation of
> GETPARENTS to cut the memory usage considerably.
> 
> For userspace, I cleaned up the xfsprogs patches so that libxfs-diff
> shows no discrepancies with the kernel and cleaned up the parent
> pointer
> usage code that I prototyped in 2017 so that it's less buggy and
> moldy.
> I also rewired xfs_scrub to use GETPARENTS to report file paths of
> corrupt files instead of inode numbers, since that part had bitrotted
> badly.
> 
> With that out of the way, I implemented a prototype of online repairs
> for directories and parent pointers.  This is only a proof of
> concept,
> because I had already backported many many patches from part 1 of
> online
> repair, and didn't feel like porting the parts needed to commit new
> structures atomically and reap the old dir/xattr blocks.  IOWs, the
> prototype scans the filesystem to build a parallel directory or xattr
> structure, and then reports on any discrepancies between the two
> versions.  Obviously this won't fix a corrupt directory tree, but it
> enables us to test the repair code on a consistent filesystem to
> demonstrate that it works.
> 
> Next, I implemented fully functional parent pointer checking and
> repair
> for xfs_repair.  This was less hard than I guessed it would be
> because
> the current design of phase 6 includes a walk of all directories. 
> From
> the dirent data, we can build a per-AG index of all the parent
> pointers
> for all the inodes in that AG, then walk all the inodes in that AG to
> compare the lists.  As you might guess, this eats a fair amount of
> memory, even with a rudimentary dirent name deduplication table to
> cut
> down on memory usage.
> 
> After that, I moved on to solving the major problem that I've been
> having with the directory repair code, and that is the problem of
> reconstructing dirents at the offsets specified by the parent
> pointers.
> The details of the problem and how I dealt with it are captured in
> the
> cover letter for those patches.  Suffice to say, we now encode the
> dirent name in the parent pointer attrname (or a collision resistant
> hash if it doesn't fit), which makes it possible to commit new
> directories atomically.
> 
> The last part of this patchset reorganizes the XFS_IOC_GETPARENTS
> ioctl
> to encode variable length parent pointer records in the caller's
> buffer.
> The denser encodings mean that we can extract the parent list with
> fewer
> kernel calls.
> 
> --D


Ermergersh, thats a lot!  Thanks for all the hard work.  I feel like if
we don't come up with a plan for review though, people may not know
where to start for these deluges!  Lets see... if we had to break this
down, I think would divide it up between the existing parent pointers
and the new pptr propositions for ofsck.  Then further divide it among
kernel space, user space and test case.  If I had to pick only one of
these to focus attention on, probably it should be new ofsck changes in
the kernel space, since the rest of the deluge is really contingent on
it. 

So now we've narrowed this down to a few subsets:

[PATCHSET v9r2d1 0/3] xfs: bug fixes for parent pointers
[PATCHSET v9r2d1 0/4] xfs: rework the GETPARENTS ioctl,
[PATCHSET v9r2d1 00/23] xfs: online fsck support patches
[PATCHSET v9r2d1 0/7] xfs: online repair of directories
[PATCHSET v9r2d1 0/2] xfs: online checking of parent pointers
[PATCHSET v9r2d1 0/3] xfs: online checking of parent pointers
[PATCHSET v9r2d1 0/2] xfs: online checking of directories
[PATCHSET v9r2d1 0/5] xfs: encode parent pointer name in xattr key
[PATCHSET v9r2d1 0/3] xfs: use flex arrays for XFS_IOC_GETPARENTS,

Of those, I think "xfs: encode parent pointer name in xattr key" is the
only one that might impact other features since it's changeing the
ondisk format from when we first started the effort years ago.  So
probably that might be the best place for people to start since if this
needs to change it might impact some of the other subsets in the
deluge, or even features they are working on if they've based anything
on the existing pptr set.

I feel like a 5 patch subset is a very reasonable thing to ask people
to give their attention to.  That way they dont get lost in things like
nits for optimizations that might not even matter if something it
depends on changes.

For the most part I am ok with changeing the format as long as everyone
is aware and in agreement so that we dont get caught up re-coding
efforts that seem to have stuggled with disagreements now on the scale
of decades.  Some of these patches were already very old by the time I
got them!

On a side note, there are some preliminary patches of kernel side
parent pointers that are either larp fixes or refactoring not sensitive
to the proposed ofsck changes.  These patches a have been floating
around for a while now, so if no one has any gripes, I think just
merging those would help cut down the amount of rebaseing, user space
porting and patch reviewing that goes on for every version.  (maybe the
first 1 though 7 of the 28 patch set, if folks are ok with that)

I think the shear size of some of these sets tend to work against them,
as people likely cannot afford the time block they present on the
surface.  So I think we would do well to find a way to introduce them
at a reasonable pace and keep attention focused on the subsections that
should require more than others, and hopefully keep thing moving in a
progressive direction.

Thx!
Allison





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux