Re: [PATCH v13 3/3] grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data

On Wed, Nov 17, 2021 at 1:01 PM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
> PCRE2_UTF will also matter for literal patterns. Try to peel apart the
> two bytes in "é" and match them under -i with/without PCRE_UTF.

Is there a real use case for doing that? And how is that "literal"
valid UTF-8, so as to warrant setting PCRE2_UTF?
I would expect that someone who includes random bytes in the expression
is really more interested in binary matching anyway, and that using -i
with it should probably be an error.

Indeed, I suspect the fact that pcre2_compile() lets it through might
be a bug in PCRE2:

$ pcre2test
PCRE2 version 10.39 2021-10-29
  re> /\303/utf,caseless
data> \303
 0: \x{c3}
data> é
No match
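
For illustration only, here is a rough Python sketch of the byte-level
facts behind that transcript. It does not use PCRE2 at all; Python's
own `re` module stands in for "character" versus "byte" matching, and
the helper name is_valid_utf8 is made up for this example. It shows
that "é" is the two bytes 0xC3 0xA9, that the first byte alone is not
valid UTF-8, and that octal 303 read as a code point (U+00C3) does not
match "é" even when case is ignored.

```python
import re

E_ACUTE = "\u00e9"                  # the character é (U+00E9)
encoded = E_ACUTE.encode("utf-8")   # its two-byte UTF-8 encoding


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The lead byte 0xC3 "peeled" off from é is not valid UTF-8 by itself.
lead_byte_alone = is_valid_utf8(encoded[:1])

# Character semantics: octal 303 is code point U+00C3, which is a
# different character from é, so a case-insensitive search finds nothing.
char_match = re.search("\u00c3", E_ACUTE, re.IGNORECASE)

# Byte semantics: the raw byte 0xC3 is found at the start of the
# encoded form of é.
byte_match = re.search(rb"\xc3", encoded)

print(encoded, lead_byte_alone, char_match, byte_match)
```

Under character semantics the "peeled" byte is simply a different
character, which fits the "No match" above; only byte-wise matching
sees it inside "é".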

Carlo



