On Thu, Aug 13, 2020 at 03:13:05PM +0100, Marc Zyngier wrote: > On Fri, 5 Jun 2020 09:00:12 -0700 > Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > Hi Kees, > > > The execve(2)/uselib(2) syscalls have always rejected non-regular > > files. Recently, it was noticed that a deadlock was introduced when trying > > to execute pipes, as the S_ISREG() test was happening too late. This was > > fixed in commit 73601ea5b7b1 ("fs/open.c: allow opening only regular files > > during execve()"), but it was added after inode_permission() had already > > run, which meant LSMs could see bogus attempts to execute non-regular > > files. > > > > Move the test into the other inode type checks (which already look > > for other pathological conditions[1]). Since there is no need to use > > FMODE_EXEC while we still have access to "acc_mode", also switch the > > test to MAY_EXEC. > > > > Also include a comment with the redundant S_ISREG() checks at the end of > > execve(2)/uselib(2) to note that they are present to avoid any mistakes. > > > > My notes on the call path, and related arguments, checks, etc: > > > > do_open_execat() > > struct open_flags open_exec_flags = { > > .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC, > > .acc_mode = MAY_EXEC, > > ... > > do_filp_open(dfd, filename, open_flags) > > path_openat(nameidata, open_flags, flags) > > file = alloc_empty_file(open_flags, current_cred()); > > do_open(nameidata, file, open_flags) > > may_open(path, acc_mode, open_flag) > > /* new location of MAY_EXEC vs S_ISREG() test */ > > inode_permission(inode, MAY_OPEN | acc_mode) > > security_inode_permission(inode, acc_mode) > > vfs_open(path, file) > > do_dentry_open(file, path->dentry->d_inode, open) > > /* old location of FMODE_EXEC vs S_ISREG() test */ > > security_file_open(f) > > open() > > > > [1] https://lore.kernel.org/lkml/202006041910.9EF0C602@keescook/ > > > > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx> > > --- > > fs/exec.c | 14 ++++++++++++-- > > fs/namei.c | 6 ++++-- > > fs/open.c | 6 ------ > > 3 files changed, 16 insertions(+), 10 deletions(-) > > > > diff --git a/fs/exec.c b/fs/exec.c > > index 30735ce1dc0e..2b708629dcd6 100644 > > --- a/fs/exec.c > > +++ b/fs/exec.c > > @@ -139,8 +139,13 @@ SYSCALL_DEFINE1(uselib, const char __user *, library) > > if (IS_ERR(file)) > > goto out; > > > > + /* > > + * may_open() has already checked for this, so it should be > > + * impossible to trip now. But we need to be extra cautious > > + * and check again at the very end too. > > + */ > > error = -EACCES; > > - if (!S_ISREG(file_inode(file)->i_mode)) > > + if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode))) > > goto exit; > > > > if (path_noexec(&file->f_path)) > > @@ -860,8 +865,13 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags) > > if (IS_ERR(file)) > > goto out; > > > > + /* > > + * may_open() has already checked for this, so it should be > > + * impossible to trip now. But we need to be extra cautious > > + * and check again at the very end too. > > + */ > > err = -EACCES; > > - if (!S_ISREG(file_inode(file)->i_mode)) > > + if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode))) > > goto exit; > > > > if (path_noexec(&file->f_path)) > > diff --git a/fs/namei.c b/fs/namei.c > > index a320371899cf..0a759b68d66e 100644 > > --- a/fs/namei.c > > +++ b/fs/namei.c > > @@ -2835,16 +2835,18 @@ static int may_open(const struct path *path, int acc_mode, int flag) > > case S_IFLNK: > > return -ELOOP; > > case S_IFDIR: > > - if (acc_mode & MAY_WRITE) > > + if (acc_mode & (MAY_WRITE | MAY_EXEC)) > > return -EISDIR; > > This seems to change (break?) the behaviour of syscalls such as execv, > which can now return -EISDIR, whereas the existing behaviour was to > return -EACCES. The man page never hints at the possibility of -EISDIR > being returned, making it feel like a regression. > > POSIX (FWIW) also says: > > <quote> > [EACCES] > The new process image file is not a regular file and the > implementation does not support execution of files of its type. > </quote> > > This has been picked up by the Bionic test suite[1], but can just as > easily be reproduced with the following snippet: > > $ cat x.c > #include <unistd.h> > #include <stdio.h> > int main(int argc, char *argv[]) > { > execv("/", NULL); > perror("execv"); > return 0; > } > > Before this patch: > $ ./x > execv: Permission denied > > After this patch: > $ ./x > execv: Is a directory That's a good point, yes. I will submit a fix for this. -- Kees Cook