Junio C Hamano <gitster@xxxxxxxxx> writes:

> Thomas Rast <trast@xxxxxxxxxxx> writes:
>
>> So we take a slightly different approach, and trade some memory for
>> better cache locality.
>
> Interesting.  It feels somewhat bait-and-switch to reveal that the
> above "some" turns out to be "double" later, but the resulting code
> does not look too bad, and the numbers do not look insignificant.

Oh, that wasn't the intent.  I was too lazy to gather memory numbers,
so here's an estimate of the local effect and some measurements of
the global one.

struct object is at least 24 bytes (flags etc. and sha1).  We grow the
hash by 2x whenever it reaches 50% load, so it is always at least 25%
loaded.  A 25% loaded hash table used to consist of 75% pointers (8
bytes) and 25% pointers-to-struct-object (32 bytes), for 14 bytes per
average slot.  Now it's 22 bytes (one more unsigned long) per slot,
i.e., roughly a 60% increase for the data managed by the hash table.

But that's using the crudest estimates I could think of.  If we assume
that an average blob or tree is at least as big as the smallest
possible commit, we'd guess that objects are at least ~240 bytes (this
is still somewhat of an estimate and assumes that you don't go and
handcraft commits with single-digit timestamps).  So both numbers
above go up by 25% * 240 = 60 bytes per average slot, to 74 and 82
bytes, which works out to about an 11% overall increase.

Here are some real numbers from

  /usr/bin/time git rev-list --all --objects

before:

  2.30user 0.02system 0:02.33elapsed 99%CPU (0avgtext+0avgdata 247760maxresident)k
  0inputs+0outputs (0major+17844minor)pagefaults 0swaps

after:

  2.18user 0.02system 0:02.21elapsed 99%CPU (0avgtext+0avgdata 261936maxresident)k
  0inputs+0outputs (0major+18202minor)pagefaults 0swaps

So that would be about 14MB or 5.7% of extra memory.
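
For anyone who wants to redo the arithmetic, here is a rough sketch in
Python.  It is only a back-of-the-envelope check: the names are made up
for illustration, the sizes are the estimates from the text (8-byte
pointers, 24-byte struct object, ~240 bytes of object data, 25% load),
and the two maxresident values are copied from the /usr/bin/time output.

```python
# Back-of-the-envelope check of the estimates in this mail.
# All names are illustrative; the sizes are rough figures, not measurements.

POINTER = 8          # bytes per pointer / unsigned long (64-bit assumed)
STRUCT_OBJECT = 24   # minimum struct object: flags etc. plus 20-byte sha1
LOAD = 0.25          # worst-case load right after the table doubles


def bytes_per_slot(extra_per_slot, payload):
    """Average bytes per slot: every slot holds a pointer (plus any extra
    per-slot field); only the loaded fraction has an object behind it."""
    return POINTER + extra_per_slot + LOAD * (STRUCT_OBJECT + payload)


def estimate(payload):
    """Return (before, after, relative increase) for a given payload size."""
    before = bytes_per_slot(0, payload)
    after = bytes_per_slot(POINTER, payload)  # one more unsigned long per slot
    return before, after, after / before - 1.0


def measured(before_kb, after_kb):
    """Return (extra MB, relative increase) from two maxresident values in KB."""
    extra_kb = after_kb - before_kb
    return extra_kb / 1024.0, extra_kb / before_kb


if __name__ == "__main__":
    for label, payload in (("crude", 0), ("with ~240-byte objects", 240)):
        before, after, rel = estimate(payload)
        print(f"{label}: {before:.0f} -> {after:.0f} bytes/slot ({rel:.1%})")
    mb, rel = measured(247760, 261936)
    print(f"measured: {mb:.1f} MB extra ({rel:.1%})")
```

The measured increase comes out lower than either estimate, as one
would expect: the estimates assume the worst-case 25% load and minimal
object sizes.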

-- 
Thomas Rast
trast@{inf,student}.ethz.ch