On Thu, Nov 04 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>>> +# U+202a..U+202e: LRE, RLE, PDF, LRO and RLO
>>> +# U+2066..U+2069: LRI, RLI, FSI and PDI
>>> +regex='(\u202a|\u202b|\u202c|\u202d|\u202e|\u2066|\u2067|\u2068|\u2069)'
>>> +
>>> +! LC_CTYPE=C git grep -El "$(LC_CTYPE=C.UTF-8 printf "$regex")" \
>>> +	-- ':(exclude,attr:binary)' ':(exclude)*.po'
>> ...
>> 	! git -P grep -nP '[\N{U+202a}-\N{U+202e}\N{U+2066}-\N{U+2069}]' ':!(attr:binary)'
>
> So you are comparing
>
>  * requiring bash and C.UTF-8 locale to be available
>
> vs
>
>  * requiring git built with PCRE
>
> assuming that "Dscho says doesn't work with PCRE and you say it
> works with PCRE" is resolved?  They seem roughly the same
> difficulty to me.

We can hard depend on a git built with PCRE: the point of this thing
is to run in GitHub CI, Ubuntu builds git with PCRE, and that's
unlikely to ever change.

The caveats around PCRE that still somewhat persist are due to a
misunderstanding, i.e. I think Johannes tried the \uXXXX syntax, which
we don't opt git-grep into, but as shown above you can just use the
other universally supported PCRE syntax for referring to Unicode
codepoints.
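
To make the two codepoint ranges quoted above concrete, here is a minimal sketch that matches the same nine characters as raw UTF-8 bytes. It is not what either variant in this thread does: has_bidi is a hypothetical helper name, and it uses plain grep on a single file rather than git grep with pathspecs. It assumes a grep that accepts non-ASCII bytes in the pattern under LC_ALL=C (GNU grep does).

```shell
# Hypothetical helper (not from the thread): succeed if the file named by $1
# contains a bidirectional control character, matched as raw UTF-8 bytes.
# U+202A..U+202E encode as E2 80 AA..AE; U+2066..U+2069 as E2 81 A6..A9.
has_bidi () {
	# printf expands the octal escapes into the literal byte sequences.
	pattern=$(printf '\342\200\252|\342\200\253|\342\200\254|\342\200\255|\342\200\256|\342\201\246|\342\201\247|\342\201\250|\342\201\251')
	# LC_ALL=C makes grep compare bytes, so no UTF-8 locale is required.
	LC_ALL=C grep -qE "$pattern" "$1"
}

# Small demonstration in a throwaway directory.
tmp=$(mktemp -d)
printf 'plain text\n' >"$tmp/clean.txt"
printf 'a\342\200\256b\n' >"$tmp/rlo.txt"	# contains U+202E (RLO)
for f in "$tmp"/*.txt
do
	if has_bidi "$f"
	then
		echo "bidi control character found: ${f##*/}"
	fi
done
rm -rf "$tmp"
```

Spelling out the bytes avoids both the C.UTF-8 locale and PCRE, at the cost of a pattern that is much harder to read than the \N{U+...} character class.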