On Fri, Sep 07, 2018 at 10:34:09AM +0200, Andrej Shadura wrote: > From: Adam Borowski <kilobyte@xxxxxxxxxx> > Both "dash -c foo" and "./foo" are supposed to be able to run hashbang-less > scripts, but attempts to execute common binary files tend to be nasty: > especially both ELF and PE tend to make dash create a bunch of files with > unprintable names, that in turn confuse some tools up to causing data loss. > Thus, let's read the first line and see if it looks like text. This is a > variant of the approach used by bash and zsh; mksh instead checks for > signatures of a bunch of common file types. > POSIX says: "If the executable file is not a text file, the shell may bypass > this command execution." I think this is worth the extra code, even though POSIX does not require it. Neither the file_is_binary function nor its caller take care of preserving errno, which might lead to confusing error messages. On another note, this is not fully POSIX-compliant because it rejects some files that comply to POSIX's definition of a text file. Per POSIX's definition, only files containing '\0' can be rejected (or containing too long lines or not ending with a newline, both of which seem like a bad idea to check); this probably requires checking more than the first line for decent detection. However, I suppose the benefits and disadvantages were weighed here and non-compliance was decided. > Signed-off-by: Adam Borowski <kilobyte@xxxxxxxxxx> > Signed-off-by: Andrej Shadura <andrew.shadura@xxxxxxxxxxxxxxx> > --- > src/exec.c | 32 ++++++++++++++++++++++++++++++++ > 1 file changed, 32 insertions(+) > > diff --git a/src/exec.c b/src/exec.c > index 9d0215a..6300001 100644 > --- a/src/exec.c > +++ b/src/exec.c > @@ -148,6 +148,36 @@ shellexec(char **argv, const char *path, int idx) > } > > > +/* > + * Check if an executable that just failed with ENOEXEC shouldn't be > + * considered a script (wrong-arch ELF/PE, junk accidentally set +x, etc). > + * We check only the first line to allow binaries encapsulated in a shell > + * script without proper quoting. The first line, if not a hashbang, is > + * likely to contain comments; even ancient encodings, at least popular > + * ones, don't use 0x7f nor values below 0x1f other than whitespace (\t, > + * \n, \v, \f, \r), ISO/IEC 2022 can have SI, SO and \e. > + */ > +STATIC int file_is_binary(const char *cmd) > +{ > + char buf[128]; > + int fd = open(cmd, O_RDONLY|O_NOCTTY); > + if (fd == -1) > + return 1; > + int len = read(fd, buf, sizeof(buf)); > + close(fd); > + for (int i = 0; i < len; ++i) { > + char c = buf[i]; > + if (c >= 0 && c <= 8 || > + c >= 16 && c <= 31 && c != 27 || > + c == 0x7f) > + return 1; > + if (c == '\n') > + return 0; > + } > + return 0; > +} > + > + > STATIC void > tryexec(char *cmd, char **argv, char **envp) > { > @@ -162,6 +192,8 @@ repeat: > execve(cmd, argv, envp); > #endif > if (cmd != path_bshell && errno == ENOEXEC) { > + if (file_is_binary(cmd)) > + return; > *argv-- = cmd; > *argv = cmd = path_bshell; > goto repeat; -- Jilles Tjoelker