Re: [PATCH] help: always suggest common-cmds if prefix of cmd

Erik Faye-Lund <kusmabite@xxxxxxxxx> · Wed, 24 Nov 2010 21:20:45 +0100

On Wed, Nov 24, 2010 at 8:49 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Erik Faye-Lund <kusmabite@xxxxxxxxx> writes:
>
>> @@ -320,9 +321,16 @@ const char *help_unknown_cmd(const char *cmd)
>>       uniq(&main_cmds);
>>
>>       /* This reuses cmdname->len for similarity index */
>> +     for (i = 0; i < main_cmds.cnt; ++i) {
>> +             main_cmds.names[i]->len = 1 +
>>                       levenshtein(cmd, main_cmds.names[i]->name, 0, 2, 1, 4);
>> +             for (n = 0; n < ARRAY_SIZE(common_cmds); ++n) {
>> +                     if (!strcmp(main_cmds.names[i]->name,
>> +                         common_cmds[n].name) &&
>> +                         !prefixcmp(main_cmds.names[i]->name, cmd))
>> +                             main_cmds.names[i]->len = 0;
>> +             }
>> +     }
>
> So main_cmds.names[]->len (which is not "len" anymore at this point but is
> just a "score") gets levenshtein distance (i.e. a smaller number indicates
> cmd is more likely to be a typo of it), and in addition ->len == 0 is "it
> is prefix".  Overall, the smaller the score, the likelier the match.
>

Correct. This was already the case, though. I just reserved len = 0 as
"this is a prefix of the entered command".

>> @@ -330,9 +338,12 @@ const char *help_unknown_cmd(const char *cmd)
>>       if (!main_cmds.cnt)
>>               die ("Uh oh. Your system reports no Git commands at all.");
>>
>> -     best_similarity = main_cmds.names[0]->len;
>> -     n = 1;
>> -     while (n < main_cmds.cnt && best_similarity == main_cmds.names[n]->len)
>> +     n = 0;
>> +     do {
>> +             best_similarity = main_cmds.names[n++]->len;
>> +     } while (!best_similarity);
>
> At this point, main_cmds.names[] is sorted by the above score (smaller to
> larger), and first you skip all the "prefix" ones that score 0.
>
> This relies on the fact that there is at least one entry with non-zero
> score, which in practice is true, but without even a comment?  I feel
> dirty.
>

Ah, yes. This needs clearer code badly. I'll see what I can cook up.

> The score of the first non-prefix entry is in best_similarity and that
> entry is at main_cmds.names[n-1] at this point.  You haven't checked
> main_cmds.names[n] yet...
>
>> +     n++;
>
> ... but you increment n to skip that entry without even looking, and then
> go on to ...
>

This is a bug, thanks for spotting it. It was intended to be the same
as "i = 1" in the old version, but I didn't think about it being
increased in the previous loop as well, silly me.

>> +     while (n < main_cmds.cnt && best_similarity >= main_cmds.names[n]->len)
>>               ++n;
>
> You skip the entries with the same similarity as the closest typo,
> presumably to point n to the first entry that is irrelevant (i.e. 0 thru n
> but not including n are candidates).
>
> Your rewrite of the loop makes it very hard to read and spot bugs, I
> think.
>

Indeed. What about this intra-diff? Hopefully it's a bit clearer, as
it's closer to the original, just reusing the same logic for the new
similar loop... Also makes the final diff smaller, which is nice.

diff --git a/help.c b/help.c
index dc76a62..d02a019 100644
--- a/help.c
+++ b/help.c
@@ -339,11 +339,10 @@ const char *help_unknown_cmd(const char *cmd)
 		die ("Uh oh. Your system reports no Git commands at all.");

 	n = 0;
-	do {
-		best_similarity = main_cmds.names[n++]->len;
-	} while (!best_similarity);
-	n++;
-	while (n < main_cmds.cnt && best_similarity >= main_cmds.names[n]->len)
+	while (n < main_cmds.cnt && !main_cmds.names[n]->len)
+		++n;
+	best_similarity = main_cmds.names[n++]->len;
+	while (n < main_cmds.cnt && best_similarity == main_cmds.names[n]->len)
 		++n;
 	if (autocorrect && n == 1 && SIMILAR_ENOUGH(best_similarity)) {
 		const char *assumed = main_cmds.names[0]->name;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html