On Thu, Jan 12 2023, Jeff King wrote:

> On Thu, Jan 12, 2023 at 06:13:13PM +0100, René Scharfe wrote:
>
>> > I'm not quite sure what you mean here by "non-greedy repetitions".
>> > Something like:
>> >
>> >   # prefer "foo bar" to "foo bar bar"; only matters for colorizing or
>> >   # --only-matching
>> >   git grep -E 'foo.*?bar'
>> >
>> > ? If so, then yeah, that changes the meaning of a bare "?" and people
>> > might be surprised by it.
>>
>> Right. To be fair, question mark is a special character and you'd
>> probably need to quote it anyway if you want to match a literal
>> question mark. Otherwise I get:
>>
>>   $ git grep -E 'foo.*?bar'
>>   fatal: command line, 'foo.*?bar': repetition-operator operand invalid
>
> This is on macOS, I assume? With glibc it seems to be quietly ignored:
>
>   $ git grep -E -o 'foo.*?ba' .clang-format
>   .clang-format:foo, bar, ba
>
> So it is not treated literally (as it would be without -E). But nor does
> it make the match non-greedy (otherwise it would have output "foo, ba",
> as "git grep -P" does).
>
> So it does seem like all bets are off for what people can and should
> expect here. Which isn't to say we should make things worse. I mostly
> wondered if REG_ENHANCED might take us closer to what glibc was doing by
> default, but it doesn't seem like it.

There's a couple of ways out of this that I don't see in this thread:

 - Declare it not a problem: We have -G, -E and -P to map to BRE, ERE and
   PCRE. One view is to say the first two must match POSIX, another is
   that whatever the platform thinks they should do is how they should
   act.

   Of course that makes git regex invocations "unportable", but it might
   be acceptable. People can always use PCRE if they want "portability".

 - Just mandate PCRE for Mac OS, then map -E to -P. We could do this with
   the pcre2convert() API and PCRE2_CONVERT_POSIX_EXTENDED flag,
   i.e. PCRE's own "translate this to ERE".

   But Perl/PCRE syntax is already a superset of ERE syntax, minus
   things like (*VERB), (?: etc., which people are unlikely to write
   without intending to get the PCRE semantics.
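
To make the greedy vs. non-greedy difference quoted above concrete, here is a minimal sketch. It uses Python's "re" module only as a stand-in for Perl-style semantics; it is not PCRE and not what git links against. The sample line is an assumption, picked so that the two outputs match the ones quoted above, since the actual .clang-format line isn't shown in the thread.

```python
import re

# Assumed sample text: the real .clang-format line is not shown in the
# thread; this string simply reproduces the two quoted outputs.
line = "foo, bar, ba"

# Greedy ".*" grabs as much as possible, then backtracks to the last "ba".
greedy = re.search(r"foo.*ba", line).group(0)

# Perl-style ".*?" is non-greedy: it stops at the first "ba" it can reach.
lazy = re.search(r"foo.*?ba", line).group(0)

print(greedy)  # foo, bar, ba  (what glibc's -E effectively produced)
print(lazy)    # foo, ba       (what "git grep -P" produces)
```

So an engine that quietly ignores the "?" behaves like the greedy case, while one that honors it gives the shorter match.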