On Mar 31, 2007, at 20:36, Linus Torvalds wrote:
So here's a suggestion:
- start finding a range using the 256-entry fan-out, exactly the
way we
did for the binary search. It's cheap. We could probably avoid EVEN
THIS, and just do one more newton-raphson iteration more. But
since we
have the data, we migth as well use it. After all, it really
*is* just
a first approximation of newton-raphson, and while it uses only
8 bits
(and we could do better), at least it's an *exact* one.
- use newton-raphson to iterate closer. It should be a much faster
way to
find the rough area for the entry we're searching for than binary
search. Two or three iterations should get us there, easily.
- do a linear search once you're close enough.
Actually, I had implemented this first, using two newton-raphson
iterations and then binary search. With just one iteration is
too little, and one iteration+binary search often is no win.
Two iterations followed by binary search cuts the nr of steps in
half for the Linux kernel. Two iterations followed by linear search
is often worse, because of "unlucky" cases that end up doing many
probes. Still, during the 5-8 probes in moderately large repositories
(1M objects), each probe pretty much requires its own cache line:
very cache unfriendly.
By using a fan-out table with an average nr of entries per bin
of 32 or so, you get a total cache footprint of between 1 and 2 bits
per entry for the random lookup. So, a complete lookup only hits
about 3 cache lines. After a few lookups, almost all of the
fan-out table will be in cache, while currently, the cache is
filled with useless 24 byte SHA1+offset of unrelated objects.
-Geert
PS. I'm travelling now, so can't participate in the discussion
or send code as much as I'd like and mail may be late or
out of sequence.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html