Stefan Beller <stefanbeller@xxxxxxxxxxxxxx> writes: > A little background on hash tables first: > Consider you want to have the object X, which you'd expect at position > i, but because that place was already taken by B, it is not found at > position i, you start looking right of position i to find X until you > find it. > > index | i-1 | i | i+1 | i+2 | i+3 | i+4 | > -------------------------------------------- > entry ... | A | B | C | D | E | X | ... > -------------------------------------------- > > Once you have found X at i+4, the commit 9a414486d9f0 (2013-05-01, > lookup_object: prioritize recently found objects) did an optimization > and swapped the object B with X, so the placement looks like > > index | i-1 | i | i+1 | i+2 | i+3 | i+4 | > -------------------------------------------- > entry ... | A | X | C | D | E | B | ... > -------------------------------------------- [...] > If we now want to find X and X is expected at i, we put X to > the position i and B to the middle position between B and X at D > and D will go to the old position of X: > > index | i-1 | i | i+1 | i+2 | i+3 | i+4 | > -------------------------------------------- > entry ... | A | X | C | B | E | D | ... > -------------------------------------------- > > So let's test how it works out: > # running the current git.git master (c1ebd90c832e), repeat 25 times: > perf stat -r 25 -- ./git-rev-list --all --objects >/dev/null > ... > 5.294512141 seconds time elapsed ( +- 7.88% ) > # Now running with this patch applied: > 5.111576725 seconds time elapsed ( +- 8.17% ) [...] > However please do check if this patch brings the promised performance > on your own, as you're likely using different hardware and another > software setup. Feel free to share your performance differences. I get this on an i7-M620 laptop from t/perf/p0001-rev-list.sh: Test HEAD next ------------------------------------------------------------------------------- 0001.1: rev-list --all 6.29(6.03+0.22) 6.33(6.06+0.24) +0.6% 0001.2: rev-list --all --objects 53.22(52.48+0.54) 54.90(54.15+0.55) +3.2%* ------------------------------------------------------------------------------- Significance hints: '.' 0.1 '*' 0.05 '**' 0.01 '***' 0.001 And this on a newer i7-3930K workstation: Test HEAD next ------------------------------------------------------------------------------- 0001.1: rev-list --all 3.86(3.71+0.14) 3.89(3.74+0.14) +0.7%* 0001.2: rev-list --all --objects 30.23(29.91+0.29) 30.35(30.04+0.29) +0.4%. ------------------------------------------------------------------------------- Significance hints: '.' 0.1 '*' 0.05 '**' 0.01 '***' 0.001 That's using linux.git as a test repository, I figured the numbers were too small with git.git. I trust the laptop numbers less because it has far more thermal (and thus throttling) issues, but the runs do show a significant difference, though less than you claimed. > @@ -90,9 +90,27 @@ struct object *lookup_object(const unsigned char *sha1) > * Move object to where we started to look for it so > * that we do not need to walk the hash table the next > * time we look for it. > + * However, we don't want to penalize the the object being > + * moved from the first position, so divide the penalty to > + * two different objects. > */ > + > + /* > + * First make sure i > first, so the middle is really > + * in between the i and first object > + */ > + if (i < first) > + i += obj_hash_size; > + > + middle = (i + first) / 2; > + if (i >= obj_hash_size) > + i -= obj_hash_size; > + if (middle >= obj_hash_size) > + middle -= obj_hash_size; > + > struct object *tmp = obj_hash[i]; Adding all the code above means that this declaration is now in the middle of things, which is not valid C90. Please move the declaration to the beginning of the block, and compile with -Wdeclaration-after-statement to notice this issue. > - obj_hash[i] = obj_hash[first]; > + obj_hash[i] = obj_hash[middle]; > + obj_hash[middle] = obj_hash[first]; > obj_hash[first] = tmp; > } > return obj; -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html