Re: [PATCH] grep: correctly identify utf-8 characters with \{b,w} in -P

Carlo Marcelo Arenas Belón  <carenas@xxxxxxxxx> writes:

> When UTF is enabled for a PCRE match, the corresponding flags are
> added to the pcre2_compile() call, but PCRE2_UCP wasn't included.

Would the same performance concern as

https://discourse.julialang.org/t/regex-pcre2-and-the-pcre2-ucp-ucp-flag/10930

apply to us as well?



>  	if (!opt->ignore_locale && is_utf8_locale() && !literal)
> -		options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF);
> +		options |= (PCRE2_UTF | PCRE2_UCP | PCRE2_MATCH_INVALID_UTF);
>  
>  #ifndef GIT_PCRE2_VERSION_10_36_OR_HIGHER
>  	/* Work around https://bugs.exim.org/show_bug.cgi?id=2642 fixed in 10.36 */


