Hi Junio,

On Thu, 25 Jul 2019, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
>
> >> OK, in short, barfing and stopping is a problem, but that flag is
> >> not the right knob to tweak. And the right knob ...
> >>
> >> > 1) We're oversupplying PCRE2_UTF now, and one such case is what's being
> >> >    reported here. I.e. there's no reason I can think of for why a
> >> >    fixed-string pattern should need PCRE2_UTF set when not combined
> >> >    with --ignore-case. We can just not do that, but maybe I'm missing
> >> >    something there.
> >> >
> >> > 2) We can do "try utf8, and fallback". A more advanced version of this
> >> >    is what the new PCRE2_MATCH_INVALID_UTF flag (mentioned upthread)
> >> >    does. I was thinking something closer to just carrying two compiled
> >> >    patterns, and falling back on the ~PCRE2_UTF one if we get a
> >> >    PCRE2_ERROR_UTF8_* error.
> >>
> >> ... lies somewhere along that line. I think that is very sensible.
> >
> > I am glad that everybody agrees with my original comment on ab/no-kwset
> > where I suggested that we should use our knowledge of the encoding of
> > the haystack and convert it to UTF-8 if we detect that the pattern is
> > UTF-8 encoded,...
>
> Please do not count me among "everybody", then. I did not think
> that Ævar meant to iconv the haystack when I wrote the message you
> are responding to, but if that was what he meant, I would not have
> said "very sensible".

Okay, but in that case I cannot agree with your assessment that it is
very sensible. If we're already deciding to paper over things, I'd much
rather prefer the simpler patch, i.e. Carlo's.

Ciao,
Dscho