Re: [RFC/PATCH] Reduce cost of deletion in levenstein distance (4 -> 3)

[ Sorry for the looong delay ]

Zbigniew Jędrzejewski-Szmek <zbyszek@xxxxxxxxx> writes:

> On 04/27/2012 10:58 AM, Matthieu Moy wrote:
>> Before this patch, a character deletion has the same cost as 2 swaps, or
>> 4 additions, so Git prefers suggesting a completely scrambled command
>> name to removing a character. For example, "git tags" suggests "stage",
>> but not "tag".
>> 
>> By setting the deletion cost to 3, we keep it higher than swaps or
>> additions, but prefer 1 deletion to 2 swaps. "git tags" now suggests
>> "tag" in addition to "stage".
>
> Hi,
> looks sensible, but I wonder if the algorithm shouldn't be tweaked even
> further. I understand why 'tags' and 'stage' are similar,
> but if I say 'tagz', git proposes (with your change), both 'stage' and
> 'tag'. 'tag' is one deletion away, but 'stage' requires a deletion and a
> replacement, so should lose to 'tag', I think.

First, my patch is already an improvement here, since it makes Git
suggest "tag" at all (previously, it suggested only "stage"). The reason
"stage" is still shown before "tag" is that a deletion costs more than an
insertion, which reflects the hypothesis that people more often miss a
character when typing than type one too many. That's probably subjective,
but I think it makes sense.
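
To make the arithmetic concrete, here is a rough sketch of the weighting,
not the actual code in Git. It assumes the "swap" in the patch description
is a single-character substitution of cost 2, that an insertion costs 1,
and that only the candidates tied for the lowest cost get suggested. It
ignores transpositions, and the names weighted_distance and closest are
made up for this illustration.

```python
def weighted_distance(typed, candidate, substitution=2, insertion=1, deletion=3):
    """Cost of turning `typed` into `candidate` with per-operation weights.

    The default weights are assumptions taken from the ratios in the patch
    description (1 deletion = 2 substitutions = 4 insertions before the patch).
    """
    # prev[j] = cost of turning the typed prefix seen so far into candidate[:j]
    prev = [j * insertion for j in range(len(candidate) + 1)]
    for i, typed_char in enumerate(typed, start=1):
        # Turning typed[:i] into the empty string takes i deletions.
        curr = [i * deletion]
        for j, cand_char in enumerate(candidate, start=1):
            curr.append(min(
                # Keep the character if it matches, otherwise substitute it.
                prev[j - 1] + (0 if typed_char == cand_char else substitution),
                # Insert a character the user forgot to type.
                curr[j - 1] + insertion,
                # Delete a character the user typed in excess.
                prev[j] + deletion,
            ))
        prev = curr
    return prev[-1]


def closest(typed, candidates, deletion):
    """Return the candidates tied for the lowest cost, in input order."""
    costs = {c: weighted_distance(typed, c, deletion=deletion) for c in candidates}
    best = min(costs.values())
    return [c for c in candidates if costs[c] == best]


commands = ["tag", "stage"]
print(closest("tags", commands, deletion=4))  # ['stage']
print(closest("tags", commands, deletion=3))  # ['tag', 'stage']
print(closest("tagz", commands, deletion=3))  # ['tag', 'stage']
```

Under these assumptions, "tags" -> "stage" costs 3 (insert "s", then
substitute the trailing "s" with "e"), while "tags" -> "tag" costs one
deletion: 4 before the patch, 3 after. So the patch turns a loss for "tag"
into a tie, and "tagz" ties in the same way.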

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/