Hi Junio,

On Wed, 3 Nov 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@xxxxxxxxx>
> writes:
>
> > +# U+202a..U+2a2e: LRE, RLE, PDF, LRO and RLO
> > +# U+2066..U+2069: LRI, RLI, FSI and PDI
> > +regex='(\u202a|\u202b|\u202c|\u202d|\u202e|\u2066|\u2067|\u2068|\u2069)'
> > +
> > +! git ls-files -z ':(attr:!binary)' |
> > +LC_CTYPE=C xargs -0r git grep -Ele "$(LC_CTYPE=C.UTF-8 printf "$regex")" --
>
> One thing for the future, and one thing for the present.
>
>  - Do some languages we would add to po/ hierarchy in the future
>    possibly want to use these sequences as legitimate contents?

I mulled over that. And I think you're right. If a right-to-left
translation needs to refer to, say, a `git` invocation, the part that
shows the command line surely would need to be guarded within
directional formatting code points.

We currently only have translated messages that read left-to-right; for
example, we lack Arabic and Hebrew translations. Those would be likely
to contain such code points on purpose.

I therefore added `:(exclude)*.po` to the command.

>  - Do we need ls-files?
>
>
> For the latter, shouldn't the attribute-based pathspec work just
> fine with "git grep"?  i.e.
>
>     git grep -l -E -e $pattern -- ':(exclude,attr:binary)'

D'oh. You're right! I fixed it.

Ciao,
Dscho
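
For reference, here is a minimal sketch of how the check might look with both changes from this reply applied (no `ls-files`, and `*.po` excluded). It is not the text of the revised patch. The function name `check_no_bidi` is hypothetical, and the use of `LC_ALL` instead of `LC_CTYPE` is an assumption made here so that an inherited `LC_ALL` cannot override the locale. It also assumes bash (for `printf` with `\u`) and an available `C.UTF-8` locale.

```bash
#!/bin/bash

# Hypothetical helper: succeeds when no tracked, non-binary file outside
# of *.po contains a Unicode directional formatting code point.
check_no_bidi () {
	local regex pattern

	# Code points quoted in the thread:
	# LRE, RLE, PDF, LRO, RLO, then LRI, RLI, FSI, PDI
	regex='(\u202a|\u202b|\u202c|\u202d|\u202e|\u2066|\u2067|\u2068|\u2069)'

	# Let bash's printf expand the \u escapes into UTF-8 bytes.
	# LC_ALL (instead of the thread's LC_CTYPE) is an assumption of
	# this sketch.
	pattern=$(LC_ALL=C.UTF-8 printf "$regex")

	# Match bytewise in the C locale. Both pathspecs are exclusions:
	# paths with the "binary" attribute, and *.po translation files.
	# The leading "!" inverts git grep's exit status, so a match makes
	# the function fail.
	! LC_ALL=C git grep -El -e "$pattern" -- \
		':(exclude,attr:binary)' ':(exclude)*.po'
}
```

Run from inside a work tree, `check_no_bidi` lists any offending paths and returns non-zero if there are some, which is how a CI step would pick up the failure.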