On Wed, Apr 08, 2020 at 08:45:12AM -0700, Darrick J. Wong wrote: > On Tue, Apr 07, 2020 at 09:30:02AM +1000, Dave Chinner wrote: > > Long story short - there's no degradation in > > performance in the dabtree out to tens of millions of records with > > different fixed size or random sized attributes, nor does various > > combinations of insert/lookup/remove/replace operations seem to > > impact the tree performance at scale. IOWs, we hit the 16 bit extent > > limits of the attribute trees without finding any degradation in > > performance. > > Ok. I'll take "attr v3 upgrade" off my list of things to look out for. > > > Hence we concluded that the dabtree structure does not require > > significant modification or optimisation to work well with typical > > parent pointer attribute demands... > > > > As for free space indexes.... > > > > The issue with the directory structure that requires external free > > space is that the directory data is not part of the dabtree itself. > > The attribute fork stores all the attributes at the leaves of the > > dabtree, while the directory structure stores the directory data in > > external blocks and the dabtree only contains the name hash index > > that points to the external data. > > > > i.e. When we add an attribute to the dabtree, we split/merge leaves > > of the tree based on where the name hash index tells us it needs to > > be inserted/removed from. i.e. we make space available or collapse > > sparse leaves of the dabtree as a side effect of inserting or > > removing objects. > > > > The directory structure is very different. The dirents cannot change > > location as their logical offset into the dir data segment is used > > as the readdir/seekdir/telldir cookie. Therefore that location is > > not allowed to change for the life of the dirent and so we can't > > store them in the leaves of a dabtree indexed in hash order because > > the offset into the tree would change as other entries are inserted > > and removed. Hence when we remove dirents, we must leave holes in > > the data segment so the rest of the dirent data does not change > > logical offset. > > > > The directory name hash index - the dabtree bit - is in a separate > > segment (the 2nd one). Because it only stores pointers to dirents in > > the data segment, it doesn't need to leave holes - the dabtree just > > merge/splits as required as pointers to the dir data segment are > > added/removed - and has no free space tracking. > > > > Hence when we go to add a dirent, we need to find the best free > > space in the dir data segment to add that dirent. This requires a > > dir data segment free space index, and that is held in the 3rd dir > > segment. Once we've found the best free space via lookup in the > > free space index, we go modify the dir data block it points to, then > > update the dabtree to point the name hash at that new dirent. > > > > IOWs, the requirement for a free space map in the directory > > structure results from storing the dirent data externally to the > > dabtree. Attributes are stored directly in the leaves of the > > dabtree - except for remote attributes which can be anywhere in the > > BMBT address space - and hence do no need external free space > > tracking to determine where to best insert them... > > <nod> Got it. I've suspected this property about the xattr structures > for a long time, so I'm glad to hear someone else echo that. :) > > Dave: May I try to rework the above into something suitable for the > ondisk format documentation? Sure. Anything that helps people understand the complexity of the directory data structure is a good thing :) -Dave. -- Dave Chinner david@xxxxxxxxxxxxx