Re: [PATCH 0/3] pack-objects: simplify add_objects_in_unpacked_packs()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Aug 29, 2021 at 10:48:43PM -0400, Taylor Blau wrote:

> This short series is extracted from mine and Peff's work on cruft packs. These
> three patches focus on cleaning up add_objects_in_unpacked_packs(), which is
> used to implement `git repack -k`.
> 
> The pay-off for this clean-up is significant, though: we net -50 lines of code,
> and the result is much more readable, at least in my opinion.
> 
> The changes are described in detail in the patch messages, but essentially we
> are replacing a loop over get_all_packs() with for_each_packed_object() after
> adding a couple of new flags necessary to make the switch. And once we are done
> with that, the third patch removes a bit from the object flag allocation table.

Thanks for extracting these from that other work. I gave them all an
extra read-through and didn't find anything wrong.

>  builtin/pack-objects.c | 85 ++++++------------------------------------
>  object-store.h         |  6 +++
>  object.h               |  1 -
>  packfile.c             |  6 +++
>  4 files changed, 24 insertions(+), 74 deletions(-)

Of course I love this diffstat, and I _really_ love dropping the one-off
bit from the flag allocation table. But I wanted to also give some
numbers for the lookup_unknown_object() part of the final patch.

Those unknown objects cost 72 bytes of heap each (the same size as a
commit, since it's the biggest struct and the unknown object is a
union). We've seen some real-world cases where there are 40M+
unreachable objects. So that's almost 3GB of wasted RAM during
pack-objects just to store those "did I see it already" bits. :)

Of course, we are already spending 96 bytes per object in "struct
object_entry", but at least that's doing more useful stuff. :)

(And the more obvious question is: why not delete those objects. The
answer thus far has been: because git's pruning is racy, and thus we
don't run it as part of our automatic maintenance).

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux