Re: [PATCH] ci: disallow directional formatting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 02, 2021 at 04:01:57PM +0100, Ævar Arnfjörð Bjarmason wrote:
> There's a parallel discussion about doing something to detect this in
> "git am", which for the git project seems like a better place to put
> this.

I don't think that one impacts the other necessarily. Having `git am`
guard against this would probably be sufficient to protect Junio
accidentally apply something containing directional formatting to his
tree unknowingly.

But the idea that we rely on the import mechanism to protect against
this doesn't sit well with me. Ultimately, we should be relying on a
static check like below to ensure that directional formatting hasn't
entered the tree by any mechanism (not just 'git am').

> > diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> > index 6ed6a9e8076..7b4b4df03c3 100644
> > --- a/.github/workflows/main.yml
> > +++ b/.github/workflows/main.yml
> > @@ -289,6 +289,13 @@ jobs:
> >      - uses: actions/checkout@v2
> >      - run: ci/install-dependencies.sh
> >      - run: ci/run-static-analysis.sh
> > +    - name: disallow Unicode directional formatting
> > +      run: |
> > +        # Use UTF-8-aware `printf` to feed a byte pattern to non-UTF-8-aware `git grep`
> > +        # (Ubuntu's `git grep` is compiled without support for libpcre, otherwise we
> > +        # could use `git grep -P` with the `\u` syntax).
> > +        ! LANG=C git grep -Il "$(LANG=C.UTF-8 printf \
> > +          '\\(\u202a\\|\u202b\\|\u202c\\|\u202d\\|\u202e\\|\u2066\\|\u2067\\|\u2068\\|\u2069\\)')"
> >    sparse:
> >      needs: ci-config
> >      if: needs.ci-config.outputs.enabled == 'yes'
> >
> > base-commit: 0cddd84c9f3e9c3d793ec93034ef679335f35e49
>
> It would be easier to maintain this if were added to
> ci/run-static-analysis.sh instead, where we have some similar tests, and
> if it lives there we could do away with the double-escaping, then it can
> also be run manually.
>
> Also, can't we just pipe "git ls-files -z" into "perl -0ne" here and use
> its unconditional support for e.g. unicode properties in regexes.

I agree that the double-escaping is ugly. I think that this would be
easier to maintain if it lived in ci/run-static-analysis.sh or its own
script like ci/check-directional-formatting.sh.

And yes, constructing a byte pattern is a little odd as well, but I
think that it's the best you can do if you're limited to running 'git
grep' without libpcre. I wondered if we could depend on perl being
around during CI, but as far as I know we can since install Perl modules
in ci/install-dependencies.sh and use Perl extensively through the test
suite.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux