Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

On Wed, Sep 22, 2021 at 1:22 PM Miro Hrončok <mhroncok@xxxxxxxxxx> wrote:
>
> Hello,
>
> for many releases, Fedora has had the brp-mangle-shebangs BuildRoot Policy Script
> that does the following:
>
>   1) Gets all executable files in the buildroot
>   2) Gets all "text" files from those
>   3a) Mangles shebangs that are "wrong"
>       (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
>   3b) Removes executable bits from "text" files without shebangs
>
> The idea behind this is that all "text" files that are executable need a
> shebang and if they don't have it, something is wrong. OTOH files that are
> "binary" don't need it.
>
> I intentionally put the terms "text" and "binary" in quotation marks, as the
> definition is somewhat fuzzy. Up until now, the script did the detection by
> utilizing the file tool to get the MIME type. If the MIME type starts with
> text/, it considered the executable to be a text file.
>
> However, a bug [1] has been discovered. Some obvious text files, such as
> executable JavaScript scripts, are detected as application/ (e.g.
> application/javascript), and hence are not considered "text". If a JavaScript
> executable script has the #!/usr/bin/env node shebang, the brp-mangle-shebangs
> script does not mangle it.
>
> One possible solution [2] to this problem is to limit the number of bytes the
> MIME detection reads. My experiments showed that limiting the number of bytes
> to 8 always recognizes JavaScript (and other scripting languages) files as
> text/plain and binary files as application/octet-stream. As a side effect, it
> might make the BRP script faster. However, I am not sure if this approach is
> deterministic enough.
>
> Another solution, suggested by Florian Weimer [3], is to not detect MIME type
> at all, but use eu-elfclassify instead. The idea is quite simple: If (and only
> if) the executable file is ELF [4], it does not require a shebang. Instead of
> some fragile idea about what files are text and what files are binary, this is
> quite deterministic. It allows mangling shebangs of executable ZIP files etc.
>
> I've drafted the eu-elfclassify solution in a pull request [5]. However, we
> have discovered that several non-ELF binary formats in Fedora may be
> legitimately executable, e.g. .exe files (for mono or wine) or other formats
> registered with the kernel [6].
>
> We are presented with 3 possible actions:
>
> 1) Keep the script as it is, say the text/ MIME type limitation is how this BRP
> script was scoped. Affected packages would need to correct shebangs manually.
>
> 2) Limit the MIME type detection to 8 bytes and hope it will not yield
> incorrect results.
>
> 3) Use eu-elfclassify. Consider non-ELF executables without shebangs bogus and
> document this. Packages that are affected would need to opt out.
>
> What do you think?
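
For reference, the rewrite described in step 3a above can be sketched roughly
as follows. This is only an illustration in Python, not the actual
brp-mangle-shebangs code: the function name mangle_env_shebang is hypothetical,
and it assumes that the interpreter named after env lives in /usr/bin and that
env invocations with options or variable assignments are left untouched.

```python
def mangle_env_shebang(first_line: str) -> str:
    """Rewrite '#!/usr/bin/env prog args' to '#!/usr/bin/prog args'.

    `first_line` is the first line of the file without its trailing newline.
    Anything that is not a plain env-style shebang is returned unchanged.
    """
    if not first_line.startswith("#!"):
        return first_line
    parts = first_line[2:].split()
    if len(parts) < 2 or parts[0] not in ("/usr/bin/env", "/bin/env"):
        return first_line
    prog = parts[1]
    # Assumption: leave 'env -S ...', 'env VAR=val ...' and path arguments
    # alone, since a simple rewrite would not preserve their meaning.
    if prog.startswith("-") or "=" in prog or "/" in prog:
        return first_line
    # Assumption: the interpreter is installed in /usr/bin.
    return "#!/usr/bin/" + " ".join(parts[1:])
```

With this sketch, "#!/usr/bin/env node" becomes "#!/usr/bin/node", while a
shebang that already names an absolute interpreter path is passed through
as-is.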

So, if I understand correctly, the problem is that right now there's
no *existing* tool that reliably detects if a given "executable" (has
mode +x) is an actual executable "script" with a valid shebang? That
sounds like a very well-defined (though narrow) problem, maybe a small
tool that's tailor-made for use within brp-mangle-shebangs would do
the trick?
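
To make that a bit more concrete, here is a minimal sketch of what such a
tool could look like, written in Python purely for illustration. Everything
in it is an assumption on my part: the name classify_executable, the result
labels, and the definition of a "valid" shebang as "#!" followed by an
absolute path. It checks the ELF magic bytes directly rather than calling
eu-elfclassify, so it is not a drop-in replacement for either approach
discussed above.

```python
import os
import stat

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def classify_executable(path: str) -> str:
    """Return 'not-executable', 'elf', 'shebang', or 'other' for `path`."""
    st = os.stat(path)
    # Only regular files with at least one executable bit are of interest.
    if not stat.S_ISREG(st.st_mode) or not (st.st_mode & 0o111):
        return "not-executable"
    with open(path, "rb") as f:
        head = f.read(256)  # enough for the magic bytes or a shebang line
    if head.startswith(ELF_MAGIC):
        return "elf"
    if head.startswith(b"#!"):
        # Assumption: a shebang counts as valid if it names an absolute path.
        interpreter = head[2:].split(b"\n", 1)[0].strip()
        if interpreter.startswith(b"/"):
            return "shebang"
    return "other"
```

Files reported as "other" would be the interesting ones: executable, neither
ELF nor a script with a usable shebang, so either something is wrong with
them or they are one of the other formats (.exe etc.) mentioned above.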

Fabio