Re: [PATCH] grep: skip UTF8 checks explicitally

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Junio,

On Thu, 25 Jul 2019, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
>
> >> OK, in short, barfing and stopping is a problem, but that flag is
> >> not the right knob to tweak.  And the right knob ...
> >>
> >> >  1) We're oversupplying PCRE2_UTF now, and one such case is what's being
> >> >     reported here. I.e. there's no reason I can think of for why a
> >> >     fixed-string pattern should need PCRE2_UTF set when not combined
> >> >     with --ignore-case. We can just not do that, but maybe I'm missing
> >> >     something there.
> >> >
> >> >  2) We can do "try utf8, and fallback". A more advanced version of this
> >> >     is what the new PCRE2_MATCH_INVALID_UTF flag (mentioned upthread)
> >> >     does. I was thinking something closer to just carrying two compiled
> >> >     patterns, and falling back on the ~PCRE2_UTF one if we get a
> >> >     PCRE2_ERROR_UTF8_* error.
> >>
> >> ... lies somewhere along that line.  I think that is very sensible.
> >
> > I am glad that everybody agrees with my original comment on ab/no-kwset
> > where I suggested that we should use our knowledge of the encoding of
> > the haystack and convert it to UTF-8 if we detect that the pattern is
> > UTF-8 encoded,...
>
> Please do not count me among "everybody", then.  I did not think
> that Ævar meant to iconv the haystack when I wrote the message you
> are responding to, but if that was what he meant, I would not have
> said "very sensible".

Okay, but in that case I cannot agree with your assessment that it is
very sensible.

If we're already deciding to paper over things, I'd much rather prefer
the simpler patch, i.e. Carlo's.

Ciao,
Dscho

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux