On Fri, Jan 10, 2025 at 06:43:08AM -0500, Jeff King wrote:

> I'll stop digging on it for now (but adding Junio to the cc as the
> author there). Probably it would have been faster just to start with a
> debugger than to look through the history. ;)

OK, my curiosity got the better of me. This fixes it:

diff --git a/grep.c b/grep.c
index 4e155ee9e6..9eac3dd95d 100644
--- a/grep.c
+++ b/grep.c
@@ -1470,10 +1470,12 @@ static int look_ahead(struct grep_opt *opt,
 		hit = patmatch(p, bol, bol + *left_p, &m, 0);
 		if (hit < 0)
 			return -1;
 		if (!hit || m.rm_so < 0 || m.rm_eo < 0)
 			continue;
+		if (m.rm_so == *left_p)
+			continue; /* don't match nothing */
 		if (earliest < 0 || m.rm_so < earliest)
 			earliest = m.rm_so;
 	}
 
 	if (earliest < 0) {

but it is weird to me that patmatch() will match "^$" to the end of the
buffer at all. It is just calling regexec_buf() behind the scenes, so I
guess this is just a weird special case there, and may even depend on
the regex implementation.

If I pass "-P" to use pcre instead, the problem goes away even without
my patch. If we skip look-ahead the problem also goes away. I'd have
thought match_line() would have the same problem, but there we process
line by line, and regexec_buf() never even sees the newline.

So I guess the rationale is: some regexec implementations are weird
about this special regex, and we should not trust their result with it
on a whole buffer with newlines.

-Peff
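
For illustration, here is a minimal standalone sketch of the guard described above. It uses plain POSIX regcomp()/regexec() on a NUL-terminated string rather than git's patmatch() and regexec_buf(). The helper names first_match_offset() and first_real_match_offset() are hypothetical and not part of git, and compiling with REG_NEWLINE is an assumption about how the buffer-wide match treats "^" and "$". Whether "^$" matches at the very end of "a\n" may depend on the regex implementation, so the sketch only shows how such a match would be rejected:

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return the start offset of the
 * first match of `pattern` in `buf`, or -1 if nothing matches or the
 * pattern fails to compile. REG_NEWLINE lets "^" and "$" match around
 * newlines inside the buffer.
 */
long first_match_offset(const char *pattern, const char *buf)
{
	regex_t re;
	regmatch_t m;
	long ret = -1;

	if (regcomp(&re, pattern, REG_NEWLINE))
		return -1;
	if (!regexec(&re, buf, 1, &m, 0))
		ret = (long)m.rm_so;
	regfree(&re);
	return ret;
}

/*
 * Same idea as the guard in the patch: a match that starts exactly at
 * the end of the buffer covers no line at all, so report "no match".
 */
long first_real_match_offset(const char *pattern, const char *buf)
{
	long off = first_match_offset(pattern, buf);

	if (off >= 0 && (size_t)off == strlen(buf))
		return -1; /* don't match nothing */
	return off;
}
```

With these helpers, first_real_match_offset("^$", "a\n") reports no match whether or not the underlying regexec() finds the empty match after the final newline, while a genuinely empty line, as in "a\n\nb\n", is still found.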