On Fri, Mar 2, 2018 at 5:54 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Fri, Mar 02, 2018 at 05:18:45PM +0700, Duy Nguyen wrote:
>
>> On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>> > linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
>> > consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
>> > all apps nearly unusable (granted the problem is partly Linux I/O
>> > scheduler too). So I wonder if we can reduce pack-objects memory
>> > footprint a bit.
>>
>> Next low hanging fruit item:
>>
>> struct revindex_entry {
>>         off_t offset;
>>         unsigned int nr;
>> };
>>
>> We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
>> we break this struct apart and store two arrays of offset and nr in
>> struct packed_git, we save 4 bytes per struct, 26 MB total.
>>
>> It's getting low but every megabyte counts for me, and it does not
>> look like breaking this struct will make horrible code (we recreate
>> the struct at find_pack_revindex()) so I'm going to do this too unless
>> someone objects. There will be slight performance regression due to
>> cache effects, but hopefully it's ok.
>
> Maybe you will prove me wrong, but I don't think splitting them is going
> to work. The point of the revindex_entry is that we sort the (offset,nr)
> tuple as a unit.
>
> Or are you planning to sort it, and then copy the result into two
> separate arrays?

Yep.

> I think that would work, but it sounds kind of nasty

Yeah :(

> (arcane code, and extra CPU work for systems that don't care about the
> 26MB).
--
Duy
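
For reference, here is a rough sketch of the "sort as tuples, then copy
into two separate arrays" idea discussed above. It is only an
illustration under stated assumptions, not the actual patch: struct
split_revindex and the split_revindex_*() helpers are hypothetical
names, plain qsort() stands in for whatever sort the real code uses,
and the 16-byte figure assumes a typical 64-bit ABI (8-byte off_t,
4-byte unsigned int, 4 bytes of padding).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>	/* off_t */

/* The combined entry quoted in the thread (16 bytes with padding on
 * typical 64-bit ABIs). */
struct revindex_entry {
	off_t offset;
	unsigned int nr;
};

/* Hypothetical split layout: two parallel arrays, 8 + 4 = 12 bytes
 * per object, since neither array needs padding. */
struct split_revindex {
	off_t *offset;
	unsigned int *nr;
	size_t count;
};

/* Order entries by pack offset. */
static int cmp_offset(const void *a, const void *b)
{
	const struct revindex_entry *x = a, *y = b;
	return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort the (offset, nr) tuples as a unit, then copy them into the two
 * arrays. Returns 0 on success, -1 if allocation fails. */
static int split_revindex_build(struct revindex_entry *entries, size_t n,
				struct split_revindex *out)
{
	size_t i;

	assert(n == 0 || entries != NULL);
	if (n)
		qsort(entries, n, sizeof(*entries), cmp_offset);

	/* Allocate at least one element so malloc(0) cannot return NULL. */
	out->offset = malloc((n ? n : 1) * sizeof(*out->offset));
	out->nr = malloc((n ? n : 1) * sizeof(*out->nr));
	if (!out->offset || !out->nr) {
		free(out->offset);
		free(out->nr);
		out->offset = NULL;
		out->nr = NULL;
		out->count = 0;
		return -1;
	}

	for (i = 0; i < n; i++) {
		out->offset[i] = entries[i].offset;
		out->nr[i] = entries[i].nr;
	}
	out->count = n;
	return 0;
}

/* Recreate the combined struct on lookup, as the thread suggests doing
 * in find_pack_revindex(). The caller must pass i < r->count. */
static struct revindex_entry split_revindex_get(const struct split_revindex *r,
						size_t i)
{
	struct revindex_entry e;

	e.offset = r->offset[i];
	e.nr = r->nr[i];
	return e;
}

static void split_revindex_free(struct split_revindex *r)
{
	free(r->offset);
	free(r->nr);
	r->offset = NULL;
	r->nr = NULL;
	r->count = 0;
}
```

Note that the temporary array of combined entries still has to exist
while sorting, so peak memory during the build is higher than with the
combined struct alone; the 4 bytes per object are only saved once the
temporary array is freed. That is the "extra CPU work" and the
"kind of nasty" part mentioned above.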