Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

Steve Grubb wrote on 23.9.2021 at 18:15:
On Wednesday, September 22, 2021 5:34:17 PM EDT Miro Hrončok wrote:
From all the scans that we've done on fullish installs in the past, there
are only 2 others that you might run across: application/x-elc (lisp) and
application/x-java-applet.

Maybe you just build in logic to work around these 3 types?
application/javascript is really the only one I can think of that is common.

Yeah, maybe we should just do that. However, that would not cleanup the
executable pngs.

They should be easy to identify: their types start with 'image'. There aren't
many types on a typical system. This is what I see in /usr on a system with
5000 packages installed:

application/gzip
application/javascript
application/json
application/octet-stream
application/vnd.ms-fontobject
application/x-bad-elf
application/x-executable
application/x-kdelnk
application/x-sharedlib
application/zip
audio/ogg
font/sfnt
image/gif
image/jpeg
image/png
image/vnd.microsoft.icon
text/html
text/plain
text/x-awk
text/x-c
text/x-gawk
text/x-lua
text/x-luatex
text/x-perl
text/x-python
text/x-ruby
text/x-shellscript
text/x-systemtap
text/x-tcl

You might just make a map since the list is not all that big. The biggest
issue is when you have things like text/plain or application/octet-stream.
That means we don't know what it is.

What about keeping the "detect mime type" approach, then dividing the results into three categories?

1. Can be executable; if so, must have a shebang, which is mangled: text/* is already there, add application/javascript and possibly others as needed.
2. Cannot be executable; remove the executable bit if found: image/* would take care of the executable pngs, many more like application/json can be added as needed.
3. The rest: do nothing with these.
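
To make the three categories concrete, here is a rough Python sketch. It is only an illustration, not the actual implementation in the RPM build scripts: the names MUST_HAVE_SHEBANG, NEVER_EXECUTABLE, classify and action_for are made up for this mail, and the pattern lists contain only the types mentioned in this thread.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern tables, filled only with types named in this thread.
# Category 1: may be executable, and if so must carry a (mangled) shebang.
MUST_HAVE_SHEBANG = ("text/*", "application/javascript")
# Category 2: must never be executable, so the executable bit gets removed.
NEVER_EXECUTABLE = ("image/*", "application/json")


def classify(mime_type):
    """Map a detected mime type to one of the three categories."""
    if any(fnmatchcase(mime_type, pat) for pat in MUST_HAVE_SHEBANG):
        return "mangle-shebang"
    if any(fnmatchcase(mime_type, pat) for pat in NEVER_EXECUTABLE):
        return "drop-exec-bit"
    # Category 3: everything else is left alone.
    return "ignore"


def action_for(mime_type, is_executable):
    """Only files that have the executable bit set need any action."""
    if not is_executable:
        return "ignore"
    return classify(mime_type)
```

With these tables, an executable image/png would lose its executable bit, an executable text/x-python would get its shebang checked and mangled, and application/x-sharedlib would be left untouched. Extending either category is just a matter of adding a pattern to the corresponding tuple.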

Maybe that would be good enough, even if the mime type detection uncertainty sets a limit on how precise it can be?

Keeping the mime type detection approach, but using less data (the first 8 bytes approach) does not sound good. If 'file' really works better that way, then there is something wrong with it.

As for the application/javascript type, there is an IETF proposal that, among other things, tries to deprecate that and de-deprecate text/javascript [1]. So, perhaps some day category 1 could be reasonably equated with text/* again.

[1]: https://datatracker.ietf.org/doc/draft-ietf-dispatch-javascript-mjs

Otto



