On Fri, Mar 02, 2018 at 05:18:45PM +0700, Duy Nguyen wrote:

> On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> > linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
> > consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
> > all apps nearly unusable (granted the problem is partly Linux I/O
> > scheduler too). So I wonder if we can reduce pack-objects memory
> > footprint a bit.
>
> Next low hanging fruit item:
>
> struct revindex_entry {
>         off_t offset;
>         unsigned int nr;
> };
>
> We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
> we break this struct apart and store two arrays of offset and nr in
> struct packed_git, we save 4 bytes per struct, 26 MB total.
>
> It's getting low but every megabyte counts for me, and it does not
> look like breaking this struct will make horrible code (we recreate
> the struct at find_pack_revindex()) so I'm going to do this too unless
> someone objects. There will be slight performance regression due to
> cache effects, but hopefully it's ok.

Maybe you will prove me wrong, but I don't think splitting them is going
to work. The point of the revindex_entry is that we sort the (offset,nr)
tuple as a unit.

Or are you planning to sort it, and then copy the result into two
separate arrays? I think that would work, but it sounds kind of nasty
(arcane code, and extra CPU work for systems that don't care about the
26MB).

-Peff
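
To make the "sort it, and then copy" idea concrete, here is a rough,
standalone sketch. It is not git code. The names split_revindex,
split_sorted() and entry_at() are made up for illustration, int64_t
stands in for a 64-bit off_t, and qsort() is only a placeholder for
whatever sort the real revindex code uses. It sorts the (offset,nr)
tuples as a unit, copies them into two parallel arrays, and rebuilds a
tuple on lookup:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the struct quoted above; int64_t plays the role of off_t. */
struct revindex_entry {
	int64_t offset;
	unsigned int nr;
};

/* Hypothetical split layout: two parallel arrays, no per-entry padding. */
struct split_revindex {
	int64_t *offset;
	unsigned int *nr;
	size_t count;
};

/* Order entries by pack offset; avoids overflow from subtracting offsets. */
static int cmp_offset(const void *a, const void *b)
{
	const struct revindex_entry *x = a, *y = b;
	return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * Sort the tuples as a unit, then copy each field into its own array.
 * Returns 0 on success, -1 on allocation failure. The caller may free
 * the temporary array "e" afterwards.
 */
static int split_sorted(struct revindex_entry *e, size_t n,
			struct split_revindex *out)
{
	size_t i;

	qsort(e, n, sizeof(*e), cmp_offset);

	out->offset = malloc(n * sizeof(*out->offset));
	out->nr = malloc(n * sizeof(*out->nr));
	if (n && (!out->offset || !out->nr)) {
		free(out->offset);
		free(out->nr);
		return -1;
	}
	for (i = 0; i < n; i++) {
		out->offset[i] = e[i].offset;
		out->nr[i] = e[i].nr;
	}
	out->count = n;
	return 0;
}

/* Recreate a tuple at lookup time from the two arrays. */
static struct revindex_entry entry_at(const struct split_revindex *r, size_t i)
{
	struct revindex_entry e;

	assert(i < r->count);
	e.offset = r->offset[i];
	e.nr = r->nr[i];
	return e;
}
```

Note that the temporary array of structs and the two split arrays exist
at the same time during the copy, so peak memory goes up briefly even
though the steady-state footprint goes down. That is part of the extra
work mentioned above.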