This is a series of three patches that change the low-level object hashing to use an "object index" rather than the pointer to a "struct object" in the hash tables. It's something I've been thinking about for a long time, so I just decided to do it.

The reason to do it is that on 64-bit architectures the object hash table is actually a fairly sizeable entity, and not for a very good reason. It has a ton of pointers to the objects we have allocated, so each hash-table entry is 64 bits, even though obviously we aren't likely to ever have that many objects.

So instead, we could use a 32-bit index into an object table - and in fact, since we already do all normal object allocations using a special dense allocator that allocates 1024 objects in one go, we were already kind of set up for this, with the low 10 bits of the object index being a very natural index into each allocation block.

Could we ever want more than 4 billion objects? Unlikely, since you'd actually need 80GB of memory just to keep track of the object names (20 bytes of SHA-1 each) in such a hash table, but hey, if that day ever comes, we can certainly trivially make the index be 64-bit instead (or more likely, make it be 48-bit and use 16 bits of the hash table entry as an extended hash value or something).

Anyway, the before-and-after numbers are somewhat debatable, so this is purely a request for discussion.

Before:

	[torvalds@woody linux]$ /usr/bin/time git-rev-list --all --objects | wc -l
	5.66user 0.46system 0:06.12elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+44389minor)pagefaults 0swaps
	445065

After:

	[torvalds@woody linux]$ /usr/bin/time ~/git/git-rev-list --all --objects | wc -l
	6.96user 0.36system 0:07.36elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+40240minor)pagefaults 0swaps
	445065

i.e. it's actually slightly slower, but it uses almost 10% less memory (going by minor page faults: 40240 vs 44389).

Is it worth it? Probably not, but since I made the patches, I thought I'd post them anyway.
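To make the index idea concrete, here is a minimal sketch of how a 32-bit object index could map onto 1024-object allocation blocks. It is not taken from the patches: the names (alloc_object_index, index_to_object, blocks), the struct layout, and the choice to reserve index 0 as the "empty hash slot" marker are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_BITS 10                    /* low 10 bits pick the slot in a block */
#define BLOCK_SIZE (1u << BLOCK_BITS)    /* 1024 objects per allocation block */
#define BLOCK_MASK (BLOCK_SIZE - 1)

/* Hypothetical stand-in for the real "struct object". */
struct object {
	unsigned char sha1[20];
	unsigned flags;
};

static struct object **blocks;           /* table of allocation blocks */
static uint32_t nr_blocks;
static uint32_t next_index = 1;          /* assumption: 0 means "empty slot" */

/* Hand out the next object and return its 32-bit index. */
static uint32_t alloc_object_index(void)
{
	uint32_t idx = next_index;
	uint32_t block = idx >> BLOCK_BITS;

	if (block >= nr_blocks) {
		/* Indices are handed out sequentially, so we grow one block at a time. */
		struct object **grown = realloc(blocks, (block + 1) * sizeof(*blocks));
		if (!grown)
			abort();
		blocks = grown;
		blocks[block] = calloc(BLOCK_SIZE, sizeof(struct object));
		if (!blocks[block])
			abort();
		nr_blocks = block + 1;
	}
	next_index++;
	return idx;
}

/* Turn an index stored in a hash-table slot back into an object pointer. */
static struct object *index_to_object(uint32_t idx)
{
	assert(idx != 0 && (idx >> BLOCK_BITS) < nr_blocks);
	return &blocks[idx >> BLOCK_BITS][idx & BLOCK_MASK];
}
```

With something like this, a hash-table slot would hold a uint32_t instead of a "struct object *", which halves the table on 64-bit machines. The cost is the extra shift, mask and load in index_to_object() on every lookup, which may be part of why the numbers above come out slower.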
And the first two patches are probably worth applying regardless - it's only the third patch that actually changes the hash to use an object index.

Anyway, the three patches are:

	0001-Use-proper-object-allocators-for-unknown-object-node.patch
	0002-Clean-up-object-creation-to-use-more-common-code.patch
	0003-Make-the-object-lookup-hash-use-a-object-index-ins.patch

where 1-2 are pretty much just cleanups.

Comments?

		Linus