Re: [PATCH] Reduce cost of deletion in levenstein distance (4 -> 3)

Matthieu Moy <Matthieu.Moy@xxxxxxx> writes:

> Before this patch, a character deletion has the same cost as 2 swaps, or
> 4 additions, so Git prefers suggesting a completely scrambled command
> name to removing a character. For example, "git tags" suggests "stage",
> but not "tag".
>
> By setting the deletion cost to 3, we keep it higher than swaps or
> additions, but prefer 1 deletion to 2 swaps. "git tags" now suggests
> "tag" in addition to "stage".
>
> Signed-off-by: Matthieu Moy <Matthieu.Moy@xxxxxxx>
> ---
> The RFC sent earlier [1] didn't receive negative comments, so I think this
> is a good change.
>
> [1] http://thread.gmane.org/gmane.comp.version-control.git/196457

Lack of objections is never a good reason to assume it is a good
change.  In this particular case, I think the reason why you saw
no comments, either positive or negative, was because nobody came up
with a more "scientific" way to judge the weighting.  Choosing
between "tag" and "stage" given "tags" is just one datapoint, and
it does not convince anybody that there are no surprising
combinations that this change affects negatively.

I wonder if we can mechanically find a set of parameters that
optimally separates the built-in commands by computing N^2 distances
(e.g. compute the distance between "tag" and every other command name
and record the minimum.  Do so for all other commands.  Now fudge the
parameters and repeat to see if it results in better minimum
separation.  Something like that).
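
To make the idea concrete, here is a rough sketch in Python.  It is
not Git's implementation: it is a plain weighted edit distance without
transpositions, and it reads the "swap" in the patch description as
replacing one character with another (cost 2), with an addition
costing 1.  The names weighted_distance, min_separation and COMMANDS
are made up for illustration, and the command list is a small
hand-picked subset, not the real set of built-ins.

```python
def weighted_distance(typed, candidate, sub=2, ins=1, dele=4):
    """Cost of turning `typed` into `candidate` (assumed weights)."""
    # Row 0: building a prefix of `candidate` from nothing is all insertions.
    prev = [j * ins for j in range(len(candidate) + 1)]
    for i, tc in enumerate(typed, start=1):
        # Column 0: reducing a prefix of `typed` to nothing is all deletions.
        cur = [i * dele]
        for j, cc in enumerate(candidate, start=1):
            cur.append(min(
                prev[j - 1] + (0 if tc == cc else sub),  # match or substitute
                prev[j] + dele,                          # delete a typed char
                cur[j - 1] + ins,                        # insert a missing char
            ))
        prev = cur
    return prev[-1]


def min_separation(commands, **weights):
    """Map each command to (distance, name) of its nearest other command."""
    return {
        a: min((weighted_distance(a, b, **weights), b)
               for b in commands if b != a)
        for a in commands
    }


# Hypothetical subset of command names, for illustration only.
COMMANDS = ["tag", "stage", "status", "stash", "add", "commit"]

for dele in (4, 3):
    sep = min_separation(COMMANDS, dele=dele)
    worst = min(d for d, _ in sep.values())
    print("deletion cost", dele, "-> smallest separation", worst)
```

Under these assumed weights, weighted_distance("tags", "stage") is 3
(one addition plus one replacement), while weighted_distance("tags",
"tag") is 4 with a deletion cost of 4 and 3 with a deletion cost of 3,
which matches the behaviour described in the patch.  A real experiment
would run min_separation over the full command list and over a grid of
weights, and would need some way to compare separations fairly across
different weight sets.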

Having said all that, until somebody comes up with a better method
of judging, I'd say that the best thing we could do is to apply this
patch and see if anybody finds a "surprising" case where this leads
to a regression.

Thanks.