Re: [PATCH] xfsdocs: capture some information about dirs vs. attrs and how they use dabtrees

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 09, 2020 at 10:16:08AM +1000, Dave Chinner wrote:
> On Wed, Apr 08, 2020 at 04:27:53PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > 
> > Dave and I had a short discussion about whether or not xattr trees
> > needed to have the same free space tracking that directories have, and
> > a comparison of how each of the two metadata types interact with
> > dabtrees resulted.  I've reworked this a bit to make it flow better as a
> > book chapter, so here we go.
> > 
> > Original-mail: https://lore.kernel.org/linux-xfs/20200404085203.1908-1-chandanrlinux@xxxxxxxxx/T/#mdd12ad06cf5d635772cc38946fc5b22e349e136f
> > Originally-from: Dave Chinner <david@xxxxxxxxxxxxx>
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> 
> Couple of things.
> 
> We are talking about btrees and where the record data is being
> stored (internal or external). Hence I think it makes sense to refer
> to "attribute records" and "directory records" (or "dirent records")
> rather than "attributes" and "directory entries"...

Ok, I'll clean that up 

> "leaves" -> "leaf nodes"

Fixed.

> > ---
> >  .../extended_attributes.asciidoc                   |   49 ++++++++++++++++++++
> >  1 file changed, 49 insertions(+)
> > 
> > diff --git a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> > index 99f7b35..d61c649 100644
> > --- a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> > +++ b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc
> > @@ -910,3 +910,52 @@ Log sequence number of the last write to this block.
> >  
> >  Filesystems formatted prior to v5 do not have this header in the remote block.
> >  Value data begins immediately at offset zero.
> > +
> > +== Key Differences Between Directories and Extended Attributes
> > +
> > +Though directories and extended attributes can take advantage of the same
> > +variable length record btree structures (i.e. the dabtree) to map name hashes
> > +to disk blocks, there are major differences in the ways that each of those
> > +users embed the btree within the information that they are storing.
> > +
> > +Directory blocks require external free space tracking because the directory
> > +blocks are not part of the dabtree itself.  The dabtree leaves for a directory
> > +map name hashes to external directory data blocks.  Extended attributes, on
> 
> "The dabtree leaves for ...." implies it is going somewhere, not
> that you are talking about leaf nodes. :) Perhaps:
> 
> "The directory dabtree leaf nodes contain a mapping between name
> hash and the location of the dirent record in the external directory
> data blocks."

<nod>

> > +the other hand, store all of the attributes in the leaves of the dabtree.
> 
> "... store the attribute records directly in the dabtree leaf
> nodes."

<nod>

> > +
> > +When we add or remove an extended attribute in the dabtree, we split or merge
> > +leaves of the tree based on where the name hash index tells us a leaf needs to
> > +be inserted into or removed.  In other words, we make space available or
> > +collapse sparse leaves of the dabtree as a side effect of inserting or
> > +removing attributes.
> > +
> > +The directory structure is very different.  Directory entries cannot change
> > +location because each entry's logical offset into the directory data segment
> > +is used as the readdir/seekdir/telldir cookie, and the cookie is required to
> > +be stable for the life of the entry.  Therefore, we cannot store directory
> > +entries in the leaves of a dabtree (which is indexed in hash order) because
> 
> The userspace readdir/seekdir/telldir directory cookie API places a
> requirement on the directory structure that dirent record cookie
> cannot change for the life of the dirent record. We use the dirent
> record's logical offset into the directory data segment for that
> cookie, and hence the dirent record cannot change location.
> Therefore, we cannot store directory records in the leaf nodes of
> the dabtree....

Ok, I'll massage that in. :)

> > +the offset into the tree would change as other entries are inserted and
> > +removed.  Hence when we remove directory entries, we must leave holes in the
> > +data segment so the rest of the entries do not move.
> > +
> > +The directory name hash index (the dabtree bit) is held in the second
> > +directory segment.  Because the dabtree only stores pointers to directory
> > +entries in the (first) data segment, there is no need to leave holes in the
> > +dabtree itself.  The dabtree merges or splits leaves as required as pointers
> > +to the directory data segment are added or removed.  The dabtree itself needs
> > +no free space tracking.
> > +
> > +When we go to add a directory entry, we need to find the best-fitting free
> 
> s/go to//

Fixed.

> > +space in the directory data segment to turn into the new entry.  This requires
> > +a free space index for the directory data segment.  The free space index is
> > +held in the third directory segment.  Once we've used the free space index to
> > +find the block with that best free space, we modify the directory data block
> > +and update the dabtree to point the name hash at the new entry.
> > +
> > +In other words, the requirement for a free space map in the directory
> > +structure results from storing the directory entry data externally to the
> > +dabtree.  Extended atttributes are stored directly in the leaves of the
> 
> dabtree leaf nodes

Fixed.

> > +dabtree (except for remote attributes which can be anywhere in the attr fork
> > +address space) and do not need external free space tracking to determine where
> > +to best insert them.  As a result, extended attributes exhibit nearly perfect
> > +scaling until we run out of memory.
> 
> Thanks for doing this, Darrick!

NP.  v2 is on its way.

--D

> -Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux