Newton-Raphson, was Re: Performance issue of 'git branch'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 23 Jul 2009, Linus Torvalds wrote:
>
> Some googling found this:
> 	http://marc.info/?l=git&m=117537594112450&w=2
> but what got merged (half a year later) was a much fancier thing by Junio.
> See sha1-lookup.c.

Thanks. Edésio Costa e Silva also gave me a useful pointer.

> That original "single iteration of newton-raphson" patch was buggy, but
> it's perhaps interesting as a concept patch.

I think Newton-Raphson is a brilliant but misleading idea. (As Junio said,
"egg of Columbus" - it certainly blew my mind!) However, Newton's method
works with smooth curves, but a pack index is a straight line plus
stochastic deviations. If you try to apply Newton's method then the more
you zoom in the more the random variations will send you away from the
place you want to be. So I think your first N-R patch was closer to being
right than its successors.

What you should do is ONE linear interpolation on the entire index. (i.e.
If you have N objects in the pack and you want to find one with SHA-1 id
S, take the top four bytes of S and multiply by N/2^32.) Note that if you
do a level-1 256-way fan-out lookup first then the random variations will
make you LESS likely to land near the right place.

After doing the first-order linear interpolation, it's probably sensible
to do a page-wise linear search (in case you don't land directly on
the page containing the target SHA-1) then a binary search within the
final page for efficiency with a hot cache.

This should give you O(1) seeks in the index per object lookup.

Tony.
-- 
f.anthony.n.finch  <dot@xxxxxxxx>  http://dotat.at/
GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS.
MODERATE OR GOOD.

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]