Re: [PATCH] grep: disable lookahead on error

Hi René,

> On 20.10.2024, at 13:02, René Scharfe <l.s.r@xxxxxx> wrote:
> 
> regexec(3) can fail.  E.g. on macOS it fails if it is used with a UTF-8
> locale to match a valid regex against a buffer containing invalid UTF-8
> characters.
> 
> git grep has two ways to search for matches in a file: Either it splits
> its contents into lines and matches them separately, or it matches the
> whole content and figures out line boundaries later.  The latter is done
> by look_ahead() and it's quicker in the common case where most files
> don't contain a match.
> 
> Fall back to line-by-line matching if look_ahead() encounters a
> regexec(3) error by propagating errors out of patmatch() and bailing out
> of look_ahead() if there is one.  This way we at least can find matches
> in lines that contain only valid characters.  That matches the behavior
> of grep(1) on macOS.
> 
> pcre2match() dies if pcre2_jit_match() or pcre2_match() fails, but since
> we use the flag PCRE2_MATCH_INVALID_UTF it handles invalid UTF-8
> characters gracefully.  So implement the fall-back only for regexec(3)
> and leave the PCRE2 matching unchanged.
> 
> Reported-by: David Gstir <david@xxxxxxxxxxxxx>
> Signed-off-by: René Scharfe <l.s.r@xxxxxx>
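
For anyone following the thread, the fall-back described in the quoted message can be sketched roughly as below. This is a minimal standalone illustration under stated assumptions, not the code from the patch: match_buf() and count_matching_lines() are hypothetical names, the buffer is copied and NUL-terminated for simplicity (so embedded NUL bytes would cut a match short), and a line on which regexec(3) itself fails is simply treated as not matching.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: match re against buf[0..len).
 * Returns 1 on a match, 0 on REG_NOMATCH, -1 on any other regexec(3) error.
 */
static int match_buf(const regex_t *re, const char *buf, size_t len)
{
	char *copy = malloc(len + 1);
	int ret;

	if (!copy)
		return -1;
	memcpy(copy, buf, len);
	copy[len] = '\0';
	ret = regexec(re, copy, 0, NULL, 0);
	free(copy);

	if (ret == 0)
		return 1;
	return ret == REG_NOMATCH ? 0 : -1;
}

/*
 * Hypothetical driver: count the lines in buf that match re.
 * The pattern is assumed to be compiled with REG_NEWLINE so that a
 * whole-buffer match cannot span line boundaries.
 */
static size_t count_matching_lines(const regex_t *re, const char *buf, size_t len)
{
	const char *p = buf, *end = buf + len;
	size_t count = 0;

	assert(re && buf);

	/* Lookahead: a single match attempt over the whole buffer. */
	if (match_buf(re, buf, len) == 0)
		return 0;	/* definite "no match": skip the per-line work */

	/*
	 * Either something matched or the lookahead failed with an error.
	 * In both cases go through the buffer line by line, so an error
	 * only affects the lines that trigger it.
	 */
	while (p < end) {
		const char *eol = memchr(p, '\n', end - p);

		if (!eol)
			eol = end;
		if (match_buf(re, p, eol - p) == 1)
			count++;
		p = eol < end ? eol + 1 : end;
	}
	return count;
}
```

The point of the sketch is the three-way result from match_buf(): only a definite "no match" from the whole-buffer attempt is allowed to short-circuit the search, while an error is handled the same way as a hit and sends the search down the slower per-line path.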

Thanks for fixing this! I’ve tested it on my end and your patch works. Feel free to add my Tested-by.

Thanks,
David



