Elliot Wolk <elliot.wolk@xxxxxxxxx> writes:

> On 07/01/2014 10:57 AM, Junio C Hamano wrote:
>> Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> writes:
>>
>>> I think it does, but based on filename suffix. E.g. here is a rename of
>>> three empty files with a suffix.
>>>
>>>  3 files changed, 0 insertions(+), 0 deletions(-)
>>>  rename 1.a => 2.a (100%)
>>>  rename 1.b => 2.b (100%)
>>>  rename 1.c => 2.c (100%)
>> This is not more than a chance.
>>
>> We tie-break rename source candidates that have the same content
>> similarity score to a rename destination using "name similarity",
>> whose implementation has been diffcore-rename.c::basename_same(),
>> which scores 1 if `basename $src` and `basename $dst` are the same
>> and 0 otherwise, i.e. from 1.a to a/1.a is judged to be a better
>> rename than from 1.a to a/2.a but otherwise there is nothing that
>> favors rename from 1.a to 2.a over 1.a to 2.b.
>
> thanks for the info!
> then i suppose my bug is a petition to have name similarity instead
> use a different statistical matching algorithm.

[administrivia: please do not top-post on this list]

I didn't think it through, but my gut feeling is that we could change
the name similarity score to be the length of the tail part that
matches (e.g. 1.a to a/2.a, which has the same two bytes at the tail,
is a better match than to a/2.b, which does not share any tail, and
to a/1.a, which shares the three bytes at the tail, is an even better
match).

Oh, and rename basename_same() to something else; currently it is
only used as the "name similarity", and after such a change, it will
still be "name similarity" but will not be asking "are basenames
the same?" anymore.

Hint, hint...
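
To make the tail-matching idea concrete, here is a rough, untested-in-git
sketch of such a score. The name name_similarity() and the choice to
compare whole paths byte by byte are assumptions made for illustration;
this is not what diffcore-rename.c currently does.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical replacement for basename_same(): return the number
 * of bytes the two paths have in common at their tails.  A larger
 * value means the names are "more similar".  The function name and
 * the scoring rule are illustrative assumptions only.
 */
static int name_similarity(const char *src, const char *dst)
{
	size_t s = strlen(src);
	size_t d = strlen(dst);
	int score = 0;

	/* Walk backwards from the end of both strings while they agree. */
	while (s && d && src[s - 1] == dst[d - 1]) {
		s--;
		d--;
		score++;
	}
	return score;
}
```

With this, name_similarity("1.a", "a/2.a") is 2, name_similarity("1.a",
"a/2.b") is 0, and name_similarity("1.a", "a/1.a") is 3, matching the
examples above. Note that the comparison can run past a '/' when the
trailing path components are identical; whether that is desirable is a
separate question.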