Re: If I were redoing git from scratch...

Linus Torvalds <torvalds@xxxxxxxx> · Sat, 4 Nov 2006 15:15:26 -0800 (PST)

On Sat, 4 Nov 2006, Linus Torvalds wrote:
> 
> In addition to that, we need one pointer per hash entry, and in order to 
> keep the hash list size down we need that hash array to be about 25% free, 
> so say 1.5 pointers per object: ~6 bytes or ~12 bytes depending on whether 
> it's a 32- or 64-bit architecture.

Btw, one of the things I considered (but rejected as being _too_ far out 
for now) during the memory shrinking thing was to make both 32-bit and 
64-bit entities use a 32-bit hash table entry.

The way to do that would be to use a 32-bit integer instead of a pointer: 
the low ~10 bits would be an index into the allocation buffer (since we 
batch allocations), and the rest of the bits would say which batch-buffer 
it is.
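
To make the encoding concrete, here is a rough sketch of what such a handle could look like. None of this is existing git code: the names (make_handle, handle_to_object, alloc_object_handle), the placeholder struct object, and the choice of exactly 10 slot bits are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOT_BITS  10u                  /* "low ~10 bits": assumed to be exactly 10 */
#define BATCH_SIZE (1u << SLOT_BITS)    /* 1024 objects per batch buffer */
#define SLOT_MASK  (BATCH_SIZE - 1u)

/* Placeholder object type for the sketch, not git's real struct object. */
struct object {
	unsigned char sha1[20];
};

static struct object **batches;         /* table of batch buffers */
static uint32_t nr_batches;             /* number of batch buffers allocated */
static uint32_t nr_used;                /* slots used in the newest batch */

/* Pack (batch index, slot index) into one 32-bit hash table entry. */
static uint32_t make_handle(uint32_t batch, uint32_t slot)
{
	assert(slot < BATCH_SIZE);
	assert(batch < (1u << (32u - SLOT_BITS)));
	return (batch << SLOT_BITS) | slot;
}

/* Turn a 32-bit handle back into a real pointer. */
static struct object *handle_to_object(uint32_t handle)
{
	return &batches[handle >> SLOT_BITS][handle & SLOT_MASK];
}

/* Hand out the next free slot, starting a new batch buffer when needed. */
static uint32_t alloc_object_handle(void)
{
	if (!nr_batches || nr_used == BATCH_SIZE) {
		struct object **grown =
			realloc(batches, (nr_batches + 1) * sizeof(*batches));
		struct object *buf = calloc(BATCH_SIZE, sizeof(*buf));

		if (!grown || !buf)
			abort();
		batches = grown;
		batches[nr_batches++] = buf;
		nr_used = 0;
	}
	return make_handle(nr_batches - 1, nr_used++);
}
```

With 10 slot bits, the remaining 22 bits can name about four million batch buffers. Note that in this sketch handle 0 is a valid object, so a real hash table would have to reserve some value to mean "empty slot"; the sketch does not deal with that.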

The reason is exactly that 8 bytes per hash entry is right now a big part 
of the object memory allocation overhead on 64-bit architectures, and 
cutting it down to just 4 bytes would save memory.

I never got around to it, if only because I actually just compile my 
user-land git stuff as 32-bit, even on my ppc64 system. And partly because 
I had shrunk the object allocations enough that I just felt pretty happy 
with it anyway, and the change would have been pretty painful. But on 
64-bit architectures, the hash table right now is about a third of the 
whole memory overhead of the object database, and cutting it down by half 
would actually be noticeable.
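
The arithmetic behind that can be sketched as follows, using only the rough figures from this thread (about 1.5 hash slots per object, and the table being about a third of the overhead). The helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Roughly 1.5 hash slots per object, the estimate quoted above. */
#define SLOTS_NUM 3
#define SLOTS_DEN 2

/* Hash table bytes per object for a given entry size in bytes. */
static size_t hash_bytes_per_object(size_t entry_size)
{
	return entry_size * SLOTS_NUM / SLOTS_DEN;
}

/*
 * Fraction of the total overhead saved when the hash table accounts for
 * table_share of it and each entry shrinks by entry_shrink (0.5 = half).
 */
static double overall_saving(double table_share, double entry_shrink)
{
	assert(table_share >= 0.0 && table_share <= 1.0);
	assert(entry_shrink >= 0.0 && entry_shrink <= 1.0);
	return table_share * entry_shrink;
}
```

So 8-byte entries cost about 12 bytes per object and 4-byte entries about 6, and halving a table that is a third of the total would trim roughly a sixth off the overall overhead.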

		Linus