On Wed, Feb 28, 2018 at 04:27:22PM +0700, Duy Nguyen wrote:

> linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
> consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
> all apps nearly unusable (granted the problem is partly Linux I/O
> scheduler too). So I wonder if we can reduce pack-objects memory
> footprint a bit.

Yeah, the per-object memory footprint is not great. Around 100 million
objects it becomes pretty ridiculous. I started to dig into it a year or
three ago when I saw such a case, but it turned out to be something that
we could prune.

The torvalds/linux fork network has ~23 million objects, so it's
probably 7-8 GB of book-keeping. Which is gross, but 64GB in a server
isn't uncommon these days.

I think laptops repacking the kernel are probably one of the worst cases
(leaving aside the awful Windows repository, but my impression is that
they simply can't do a full repack at all there).

> This demonstration patch (probably breaks some tests) would reduce the
> size of struct object_entry from 136 down to 112 bytes on x86-64.
> There are 6483999 of these objects, so the saving is 17% or 148 MB.

136 bytes x 6.5M objects is only about 800MB. I suspect a big chunk of
the rest is going to the object structs we create when doing the
internal rev-list traversal. And those duplicate the 20-byte sha1s at
the very least.

I don't know if it would be a good idea to free them after the
traversal, though. We do use them again later in the bitmap case. On the
other hand, we could probably do so for the non-bitmap case. And even
for the bitmap case, the primary value in keeping them around is that
the parent pointers will already be cached. So it might make sense to
free the blobs and trees (though it might be tricky; the commits have
pointers to the trees).

It also doesn't help with peak memory usage, because you'll have the
full to_pack list and all of the "struct object" in memory together at
one point.

Another option would be to somehow replace the pack_idx_entry with a
reference to a "struct object". That would let us at least avoid storing
the 20-byte oid twice.

> If we go further, notice that nr_objects is uint32_t, we could convert
> the three pointers
>
> 	struct object_entry *delta;
> 	struct object_entry *delta_child;
> 	struct object_entry *delta_sibling;
>
> to
>
> 	uint32_t delta;
> 	uint32_t delta_child;
> 	uint32_t delta_sibling;
>
> which saves 12 bytes (or another 74 MB). 222 MB total is plenty of
> space to keep some file cache from being evicted.

Yeah, that seems like low-hanging fruit. I'd also note that we don't
actually use all of the fields during the whole process. I think some of
those delta fields are only used for a short time. So we might be able
to reduce peak memory if there are some mutually exclusive bits of each
entry (and even if there's some overlap, we'd reduce the length of time
we'd need to be at peak).

> Is it worth doing this? The struct packing makes it harder to read
> (and more fragile too). I added some more artificial limit like max
> depth of 2^11. But I think 4096+ depth is getting unreasonable.

I'm OK with a limit like 4096, as long as we notice when we hit the
limit and behave reasonably. I think the algorithm in
break_delta_chains() may temporarily store up to uint32_t depth. But I
think we won't ever write anything into cur->depth larger than the
max-depth limit. So it would probably be sufficient to just check that
the --depth argument is reasonably sized and complain otherwise.
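Just to make the index-plus-bitfield idea concrete, here is a rough
sketch of what the slimmed-down entry could look like. This is not the
actual patch; it assumes the usual pack-objects.h context (struct
pack_idx_entry, struct packed_git), and the field names and exact
layout are made up for illustration:

	/*
	 * Sketch only; not Duy's patch. Assumes pack-objects.h
	 * declarations are in scope.
	 */
	struct object_entry {
		struct pack_idx_entry idx;
		unsigned long size;          /* uncompressed size */
		struct packed_git *in_pack;  /* already in pack */
		off_t in_pack_offset;

		/*
		 * 1-based indices into to_pack.objects[], replacing
		 * the three object_entry pointers; 0 means "none".
		 * nr_objects is a uint32_t, so 32 bits always suffice.
		 */
		uint32_t delta_idx;
		uint32_t delta_child_idx;
		uint32_t delta_sibling_idx;

		unsigned depth:11;           /* caps --depth at 2047 */
		unsigned type:3;
		unsigned in_pack_type:3;
		unsigned preferred_base:1;
		unsigned no_try_delta:1;
		/* ... remaining fields as before ... */
	};

And then option parsing could just reject anything that does not fit in
the bitfield, something like:

	if (depth >= (1 << 11))
		die("--depth %d is too deep; maximum is %d",
		    depth, (1 << 11) - 1);

The 0-means-none convention avoids needing a separate "has delta" bit,
at the cost of the indices being 1-based.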
I do agree this makes things a bit harder to read, but I think the
benefits are pretty measurable. And it may make a real usability
difference on a large repository.

-Peff