Re: Fix a pathological case in git detecting proper renames


On Thu, 29 Nov 2007, Linus Torvalds wrote:
> 
> It's worth noting a few gotchas:
> 
>  - this scoring is currently only done for the "exact match" case. 
> 
>    In particular, in Kumar's example, even after this patch, the inexact
>    match case is still done as a copy+delete rather than as two renames:
> 
> 	 delete mode 100644 board/cds/mpc8555cds/u-boot.lds
> 	 copy board/{cds => freescale}/mpc8541cds/u-boot.lds (97%)
> 	 rename board/{cds/mpc8541cds => freescale/mpc8555cds}/u-boot.lds (97%)
> 
>    because apparently the "cds/mpc8541cds/u-boot.lds" copy looked 
>    a bit more similar to both end results. That said, I *suspect* we just 
>    have the exact same issue there - the similarity analysis just gave 
>    identical (or at least very _close_ to identical) similarity points, 
>    and we do not have any logic to prefer multiple renames over a 
>    copy/delete there.
> 
>    That is a separate patch.

Side note: just in case people were expecting me to actually _ship_ that 
separate patch that handles the fuzzy matches too: I wasn't planning on 
doing that patch. The way the fuzzy rename detection is currently done, 
that would be quite painful.

For the fuzzy rename detection, we generate the full score matrix and 
sort it by score, up front. So all the scoring (and more importantly, 
all the sorting) has already been done before we start looking at *any* 
renames at all. That means we cannot easily do what I did for the exact 
renames, namely take _earlier_ renames into account in the scoring: 
those earlier renames simply have not been done yet when the score is 
calculated.
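
To make the ordering problem concrete, here is a minimal sketch of the
"score everything, sort once, then pick" shape described above. It is an
illustration only, not git's actual code: the function name, the short
file names and the similarity numbers are all made up.

```python
# Hypothetical illustration, NOT git's implementation.
# scores maps (source, destination) -> similarity percentage (made-up values).

def detect_renames_upfront(scores):
    # Step 1: the whole matrix is scored and sorted before any rename is picked.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    # Step 2: walk the pre-sorted list; each destination takes the first
    # (highest-scoring) source it meets. The order was fixed in step 1, so
    # nothing decided here can change what a later destination prefers.
    chosen = {}
    for (src, dst), score in ranked:
        if dst not in chosen:
            chosen[dst] = (src, score)
    return chosen


# Two old files (A, B) and two new files (X, Y); A scores slightly higher
# against both new files.
example_scores = {
    ("A", "X"): 97, ("A", "Y"): 97,
    ("B", "X"): 96, ("B", "Y"): 96,
}
picked = detect_renames_upfront(example_scores)
print(picked)  # {'X': ('A', 97), 'Y': ('A', 97)}
```

Both new files end up paired with A, and B is left with no destination, 
which is the copy+delete shape rather than two renames.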

This would probably become easier to do with the linear-time hash-based 
similarity engine (the stuff Jeff King was working on), but the way the 
code is currently structured - with no incremental rename detection at 
all, and with all the scoring in one global table - it's pretty painful.
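
For contrast, an incremental structure might look roughly like the sketch
below, where the candidates are re-ranked after every pick so that an
earlier rename can influence a later score. Everything here is an
assumption made for illustration: the function name, the `used_penalty`
parameter and the idea of subtracting a fixed penalty from already-used
sources are invented, and this is not what git does.

```python
# Hypothetical illustration, NOT git's implementation.
# scores maps (source, destination) -> similarity percentage (made-up values).

def detect_renames_incremental(scores, used_penalty=5):
    remaining = {dst for (_, dst) in scores}
    used_sources = set()
    chosen = {}

    def adjusted(item):
        # Invented heuristic: a source that was already consumed by an
        # earlier rename looks slightly less attractive.
        (src, _), score = item
        return score - (used_penalty if src in used_sources else 0)

    while remaining:
        # Re-rank on every round, so earlier picks affect this one.
        candidates = [(k, v) for k, v in scores.items() if k[1] in remaining]
        (src, dst), score = max(candidates, key=adjusted)
        chosen[dst] = (src, score)
        used_sources.add(src)
        remaining.discard(dst)
    return chosen


example_scores = {
    ("A", "X"): 97, ("A", "Y"): 97,
    ("B", "X"): 96, ("B", "Y"): 96,
}
print(detect_renames_incremental(example_scores))
# {'X': ('A', 97), 'Y': ('B', 96)}
```

With the same made-up scores, the second pick now prefers the unused 
source B, giving two renames. The price is that the ranking is redone on 
every round instead of once, which is exactly what a single global sorted 
table does not allow.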

			Linus