Re: [PATCH v3] ci: disallow directional formatting

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Tue, 09 Nov 2021 14:34:12 +0100

On Mon, Nov 08 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>>> So you are comparing
>>>
>>>  * requiring bash and C.UTF-8 locale to be available
>>>
>>> vs
>>>
>>>  * requiring git built with PCRE
>>>
>>> assuming that "Dscho says doesn't work with PCRE and you say it
>>> works with PCRE" is resolved?  They seem roughly the same
>>> difficulty to me.
>>
>> We can hard depend on a git built with PCRE, since the point of this
>> thing is to run in GitHub CI, Ubuntu builds git with PCRE, and that's
>> unlikely to ever change.
>
> Yes, so is the availability of bash and C.UTF-8 for the same reason:
> we are talking about controlled environment.  That is what I meant
> by "roughly the same difficulty to me".
>
> FWIW, I am OK with either approach, as I find the patch in question
> is just as readable as any rewrite that would use "grep -P", so...

To each his own I guess :) I do find the simple regex of:

    '[\N{U+202a}-\N{U+202e}\N{U+2066}-\N{U+2069}]'

Much easier to understand than something using printf, shell
interpolation, and needing to switch around LC_CTYPE to two different
values twice on one line.

But anyway, that's a matter of taste.

What isn't is the issue I noted at the bottom of [1], i.e. if we're
relying on '(attr:binary)' we should probably start with an assertion
that our idea of "binary" matches reality, or perhaps go back to the -I
heuristic.

Because now we've got at least one binary non-attribute-marked file, and
if files like that ever get updated they might start matching this
pattern. Maybe not a big deal, but someone updating those might
confusingly trip over this otherwise...

1. https://lore.kernel.org/git/211103.86zgqlhzvz.gmgdl@xxxxxxxxxxxxxxxxxxx/