[PATCH] ci: disallow directional formatting

From: Johannes Schindelin <johannes.schindelin@xxxxxx>

As described in https://trojansource.codes/trojan-source.pdf, it is
possible to abuse directional formatting (a feature of Unicode) to
deceive human readers into interpreting code differently from compilers.

It is highly unlikely that Git's source code wants to contain such
directional formatting in the first place, so let's disallow it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
---
    ci: disallow directional formatting
    
    I just stumbled over
    https://siliconangle.com/2021/11/01/trojan-source-technique-can-inject-malware-source-code-without-detection/,
    which details an interesting social-engineering attack: it uses
    directional formatting in source code to make human readers believe
    that the code does something other than what it actually does.
    
    It is highly unlikely that Git's source code wants to contain such
    directional formatting in the first place, so let's disallow it.
    
    Technically, this is not exactly -rc material, but the paper was just
    published, and I want us to be safe.
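
For readers who want to try the idea outside of CI, here is a minimal sketch of the same byte-level check. It is not part of the patch: the function name `contains_bidi_controls` is made up for illustration, it uses plain `grep` on the given files rather than `git grep`, and it spells out the UTF-8 bytes with octal escapes instead of relying on `printf` understanding `\u`.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): succeeds if any of the given
# files contains one of the Unicode directional formatting characters
# U+202A..U+202E or U+2066..U+2069.
contains_bidi_controls () {
	# In UTF-8, U+202A..U+202E are the bytes E2 80 AA..AE and
	# U+2066..U+2069 are the bytes E2 81 A6..A9. Octal escapes are
	# used because POSIX `printf` handles them in the format string.
	embeddings=$(printf '\342\200[\252\253\254\255\256]')
	isolates=$(printf '\342\201[\246\247\250\251]')

	# LC_ALL=C makes grep match byte by byte instead of decoding UTF-8.
	LC_ALL=C grep -q -e "$embeddings" -e "$isolates" -- "$@"
}
```

A call such as `contains_bidi_controls path/to/file.c && echo "found"` reports a hit. The CI step in the patch inverts the result with `!` so that a match fails the job.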

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1071%2Fdscho%2Fcheck-for-utf-8-directional-formatting-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1071/dscho/check-for-utf-8-directional-formatting-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1071

 .github/workflows/main.yml | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 6ed6a9e8076..7b4b4df03c3 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -289,6 +289,13 @@ jobs:
     - uses: actions/checkout@v2
     - run: ci/install-dependencies.sh
     - run: ci/run-static-analysis.sh
+    - name: disallow Unicode directional formatting
+      run: |
+        # Use UTF-8-aware `printf` to feed a byte pattern to non-UTF-8-aware `git grep`
+        # (Ubuntu's `git grep` is compiled without support for libpcre, otherwise we
+        # could use `git grep -P` with the `\u` syntax).
+        ! LANG=C git grep -Il "$(LANG=C.UTF-8 printf \
+          '\\(\u202a\\|\u202b\\|\u202c\\|\u202d\\|\u202e\\|\u2066\\|\u2067\\|\u2068\\|\u2069\\)')"
   sparse:
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'

base-commit: 0cddd84c9f3e9c3d793ec93034ef679335f35e49
-- 
gitgitgadget