Re: [PATCH 2/2] ext4: shrink directories on dentry delete

On Mar 26, 2020, at 1:49 PM, harshad shirwadkar <harshadshirwadkar@xxxxxxxxx> wrote:
> 
> On Wed, Mar 25, 2020 at 3:06 AM Andreas Dilger <adilger@xxxxxxxxx> wrote:
>> 
>> On Mar 25, 2020, at 3:37 AM, Harshad Shirwadkar <harshadshirwadkar@xxxxxxxxx> wrote:
>>> But note that, in the average case, most of the shrinking happens during
>>> the last 1-2% of deletions. Therefore, the next step here is to merge dx
>>> nodes when possible. That can be achieved by storing the fullness index in
>>> htree nodes. But that's an on-disk format change. We can instead build
>>> on the tooling added by this patch to perform a reverse lookup on a dx
>>> node and then read adjacent nodes to check their fullness.
>> 
>> Thank you for updating these patches again.  I haven't had a chance to look
>> at them yet, but I hope to review the patches in the near future.
>> 
>> As for storing the fullness on disk changing the on-disk format...  That is
>> true, but the original htree implementation anticipated this and reserved
>> space in the htree index to store the fullness, so it would not break the
>> ability of older kernels to access directories with the fullness information.
>> 
> Yeah, you are right, good to know that we have bits reserved already
> and that wouldn't break older kernels if we use these in future.
>> I think if you used just a few bits (maybe just 2) to store:
>> 0 = unset (every directory today)
>> 1 = under 20% full
>> 2 = under 40% full
>> 3 = under 60% full
>> 
>> or similar.  It doesn't matter if they are more full since they won't be
>> candidates for merging.  Lazily updating the htree index fullness as
>> entries are removed will simplify the shrinking process, and will avoid
>> the need to repeatedly scan the leaf blocks to see if they are empty
>> enough for merging.  It wouldn't be any worse *not* to store these values
>> on disk after the first time a "0 = unset" entry was found and not merged,
>> or setting the fullness on the merged block if it is merged, and running
>> "e2fsck -D" can easily update the fullness values.
>> 
>> The benefit of using 20%, 40%, and 60% as the fullness markers is that it
>> is possible to either merge adjacent 60% and 40% blocks or alternately a
>> 60% and two adjacent 20% blocks.  Also, since these values are very coarse
>> they would not need to be updated frequently.  If the values are slightly
>> outdated, then it is again not worse than the "always scan" model (one scan
>> and the fullness would be updated), but more efficient than repeat scanning.
>> 
>> Using only two bits for fullness also leaves two bits free for future use.
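
A minimal sketch of the two-bit fullness hint described above, written in Python purely for illustration. The names `fullness_class` and `can_merge` are hypothetical and do not correspond to any ext4 function. Measuring fullness as used bytes over block size, and treating each class as its upper bound when deciding whether blocks fit together, are assumptions made for the example.

```python
# Hypothetical illustration of the 2-bit fullness hint; this is not ext4 code.
UNSET = 0  # no hint recorded (every directory today), or block too full to matter

# Upper bound, in percent, implied by each non-zero class value.
UPPER_BOUND = {1: 20, 2: 40, 3: 60}


def fullness_class(used_bytes, block_size):
    """Return the 2-bit class for a leaf block.

    Blocks that are 60% full or more get UNSET, since they would not be
    merge candidates anyway.
    """
    percent = 100 * used_bytes / block_size
    for cls in (1, 2, 3):
        if percent < UPPER_BOUND[cls]:
            return cls
    return UNSET


def can_merge(classes):
    """True if adjacent blocks with these classes are known to fit in one block."""
    if any(c == UNSET for c in classes):
        return False  # fullness unknown: the leaf blocks would have to be scanned
    # Each block is strictly under its bound, so a total of 100 still fits.
    return sum(UPPER_BOUND[c] for c in classes) <= 100
```

With these assumptions, `can_merge([3, 2])` and `can_merge([3, 1, 1])` both hold, matching the 60% + 40% and 60% + 20% + 20% combinations mentioned above, while `can_merge([3, 3])` does not.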
> 
> Thanks Andreas, that makes sense. This kind of merging will require a
> lot of the tooling provided in this patch - for example, swapping a freed
> block with the last block so as not to leave any holes. So, my hope is
> that we get this patch in first and thereby get a step closer to a
> coalescing solution.
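
The "swap the freed block with the last block" idea can be sketched as follows, again in Python for illustration only. The function `release_block` and the flat `index` list are hypothetical stand-ins: in a real htree the reference to the moved block lives in a parent dx node, which is why a reverse lookup is needed. Here a plain list of block numbers plays that role.

```python
# Hypothetical model: `blocks` is the directory's block list, `index` holds the
# block numbers that the htree index points at. This is not ext4 code.
def release_block(blocks, index, freed):
    """Free block number `freed` without leaving a hole in `blocks`.

    Returns the updated index; `blocks` is shortened in place by one.
    """
    last = len(blocks) - 1
    # Drop the index entry that pointed at the block being freed.
    index = [b for b in index if b != freed]
    if freed != last:
        # Move the last block into the freed slot...
        blocks[freed] = blocks[last]
        # ...and repoint whatever referenced the last block.
        index = [freed if b == last else b for b in index]
    blocks.pop()  # the directory is now one block shorter
    return index


blocks = ["root", "leaf-a", "leaf-b", "leaf-c"]
index = release_block(blocks, [1, 2, 3], freed=1)
# blocks is now ["root", "leaf-c", "leaf-b"] and index is [2, 1]
```

Because the hole is always filled from the end, the directory can simply be truncated by one block after each release.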

I definitely *do not* want to block the landing of these initial patches
until a "full featured" directory shrinking is complete.  These patches
at least provide some basic functionality, and will shrink a large
directory if it becomes totally empty, so I'm in favour of that.

Cheers, Andreas






