Re: Git grep does not support multi-byte characters (like UTF-8)

Duy Nguyen <pclouds@xxxxxxxxx> writes:

> On top of this, pickaxe already supports icase even when kws is used. But
> it only works for ascii, so either we fix it and support non-ascii, or
> we remove icase support entirely from diffcore_pickaxe(). I vote the
> former.

I think that is a different issue.  The pickaxe has a single very
narrowly-defined intended use case [*1*] and I do not care too much
how any use that is outside the intended use case behaves.  As long
as its intended use case does not suffer (1) correctness-wise, (2)
performance-wise and (3) code-cleanliness-wise, due to changes to
support such enhancements, I am perfectly fine.

Ascii-only icase match is one example of a feature that is outside
the intended use case, and implementation of it using kws is nearly
free if I am not mistaken, not making the primary use case suffer in
any way.
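
To illustrate why an ASCII-only icase match can be nearly free, here
is a minimal Python sketch (not git's actual kwset code).  A 256-entry
byte translation table folds A-Z to a-z and leaves every other byte
alone, so the matcher still works one byte at a time.  The names
ASCII_FOLD and ascii_icase_contains are made up for this illustration.

```python
# Hypothetical illustration only; git's kwset implementation is in C.
# Build a 256-entry table: bytes 'A'..'Z' map to 'a'..'z', all other
# bytes (including every byte >= 0x80) map to themselves.
ASCII_FOLD = bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in range(256))


def ascii_icase_contains(haystack: bytes, needle: bytes) -> bool:
    """Return True if needle occurs in haystack, ignoring ASCII case only."""
    return needle.translate(ASCII_FOLD) in haystack.translate(ASCII_FOLD)


ascii_hit = ascii_icase_contains(b"Hello World", b"WORLD")
# UTF-8 bytes of "É" and "é" differ and are untouched by the table,
# so the ASCII-only fold does not treat them as equal.
non_ascii_hit = ascii_icase_contains("É".encode("utf-8"), "é".encode("utf-8"))
```

The fold is a fixed per-byte lookup, which is why it does not change
how the byte-oriented search itself has to work.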

I however am highly skeptical that the same thing can be done with
non-ascii icase.  As long as it can be added without making the
primary use case suffer in any way, I do not mind it very much.
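
One possible source of difficulty, offered as an illustration rather
than something stated above: in UTF-8, case mapping is not a
byte-for-byte substitution, so a per-byte table cannot express it.
The sketch below uses Python's built-in str.lower(); the helper name
utf8_len is hypothetical.

```python
def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


KELVIN = "\u212a"  # KELVIN SIGN, whose lowercase form is plain ASCII "k"
lowered = KELVIN.lower()

# The encoded length changes under case mapping, so needle and
# haystack cannot simply be folded in place one byte at a time.
length_before = utf8_len(KELVIN)
length_after = utf8_len(lowered)
```

Here the character shrinks from three bytes to one when lowercased,
so a matcher would have to decode characters instead of translating
bytes.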

Thanks.


[Footnote]

*1* The requirement is very simple.  You get a string that is unique
in a blob that exists at the revision your traversal begins, and you
want to find the point where the blob at the corresponding path does
not have that exact string with minimal effort.  You do not need to
ensure that the input string is unique (it is a user error and the
behaviour is undefined) and for simplicity you are also allowed to
fire when the blob has more than one copy of the string (even
though the expected use is to find the place where the blob has
zero).

Any other cases, e.g. the string was not unique in the blob, the
user specified "ignore-case" and other irrelevant options, are
allowed to be incorrect or slow or both, as $gmane/217 does not need
such uses to implement it ;-)
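
The requirement in the footnote can be sketched roughly as follows.
This is a toy model in Python under stated assumptions, not how
diffcore_pickaxe() is written: the history is assumed to be a plain
list of blob contents for one path, newest first, and the function
name find_disappearance is made up for this illustration.

```python
from typing import Optional, Sequence


def find_disappearance(history: Sequence[bytes], needle: bytes) -> Optional[int]:
    """Return the index of the first version that lacks the exact string.

    history[0] is the blob at the revision where traversal begins;
    later entries are progressively older versions at the same path.
    Returns None if every version contains the string.
    """
    for index, blob in enumerate(history):
        # Exact byte-for-byte search; no case folding, no regex.
        if needle not in blob:
            return index
    return None


versions = [
    b"int foo(void);\nint bar(void);\n",  # starting revision
    b"int foo(void);\n",                  # older: bar() not there yet
    b"int foo(void);\n",
]
first_missing = find_disappearance(versions, b"int bar(void);")
```

This strict version fires only when the string is absent.  Per the
footnote, an implementation would also be allowed to fire when a
version holds more than one copy, which this sketch does not do.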