On Mon, 2018-08-27 at 13:47 -0400, Jeff Layton wrote:
> POSIX mandates that open fds and their associated file locks should be
> preserved across an execve. This works, unless the process is
> multithreaded at the time that execve is called.
>
> In that case, we'll end up unsharing the files_struct but the locks will
> still have their fl_owner set to the address of the old one. Eventually,
> when the other threads die and the last reference to the old
> files_struct is put, any POSIX locks get torn down since it looks like
> a close occurred on them.
>
> The result is that all of your open files will be intact with none of
> the locks you held before execve. The simple answer to this is "use OFD
> locks", but this is a nasty surprise and it violates the spec.
>
> Fix this by doing unshare_files later during exec, after we've already
> killed off the other threads in the process. This helps ensure that we
> only unshare the files_struct during exec when it is truly shared with
> other processes.
>
> Note that because the unshare_files call is now done just after
> de_thread, we need a mechanism to pass the displaced files_struct back
> up to __do_execve_file. This is done via a new displaced_files field
> inside the linux_binprm.
>
> Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> Reported-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
>  fs/exec.c               | 19 +++++++++----------
>  include/linux/binfmts.h |  1 +
>  2 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index ca25f805ebad..a45b0cae5817 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1262,6 +1262,10 @@ int flush_old_exec(struct linux_binprm * bprm)
>  	if (retval)
>  		goto out;
>
> +	retval = unshare_files(&bprm->displaced_files);
> +	if (retval)
> +		goto out;
> +
>  	/*
>  	 * Must be called _before_ exec_mmap() as bprm->mm is
>  	 * not visibile until then. This also enables the update
> @@ -1712,8 +1716,7 @@ static int __do_execve_file(int fd, struct filename *filename,
>  			      int flags, struct file *file)
>  {
>  	char *pathbuf = NULL;
> -	struct linux_binprm *bprm;
> -	struct files_struct *displaced;
> +	struct linux_binprm *bprm = NULL;
>  	int retval;
>
>  	if (IS_ERR(filename))
> @@ -1735,10 +1738,6 @@ static int __do_execve_file(int fd, struct filename *filename,
>  	 * further execve() calls fail. */
>  	current->flags &= ~PF_NPROC_EXCEEDED;
>
> -	retval = unshare_files(&displaced);
> -	if (retval)
> -		goto out_ret;
> -
>  	retval = -ENOMEM;
>  	bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
>  	if (!bprm)
> @@ -1831,8 +1830,8 @@ static int __do_execve_file(int fd, struct filename *filename,
>  	kfree(pathbuf);
>  	if (filename)
>  		putname(filename);
> -	if (displaced) {
> -		put_files_struct(displaced);
> +	if (bprm->displaced_files) {
> +		put_files_struct(bprm->displaced_files);

Note that this is broken (bprm is freed above this point). It's simple
enough to fix, but I'll hold off on resending until I hear some feedback
on the general approach.

>  	} else {
>  		spin_lock(&current->files->file_lock);
>  		current->files->in_exec = false;
> @@ -1855,8 +1854,8 @@ static int __do_execve_file(int fd, struct filename *filename,
>  	kfree(pathbuf);
>
>  out_files:
> -	if (displaced) {
> -		reset_files_struct(displaced);
> +	if (bprm && bprm->displaced_files) {
> +		reset_files_struct(bprm->displaced_files);
>  	} else {
>  		spin_lock(&current->files->file_lock);
>  		current->files->in_exec = false;
> diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
> index c05f24fac4f6..d7ec384bb1b0 100644
> --- a/include/linux/binfmts.h
> +++ b/include/linux/binfmts.h
> @@ -49,6 +49,7 @@ struct linux_binprm {
>  	unsigned int taso:1;
>  #endif
>  	unsigned int recursion_depth; /* only for search_binary_handler() */
> +	struct files_struct * displaced_files;
>  	struct file * file;
>  	struct cred *cred;	/* new credentials */
>  	int unsafe;		/* how unsafe this exec is (mask of LSM_UNSAFE_*) */

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
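
For reference, the "use OFD locks" workaround mentioned in the patch
description looks roughly like this from userspace. This is only a minimal
sketch and not part of the patch: ofd_write_lock() and ofd_lock_demo() are
names made up for illustration, and the temporary file path is an
assumption. OFD locks belong to the open file description rather than to
the process, so two opens of the same file conflict even within one
process, and the lock goes away when that description is closed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes F_OFD_SETLK only with this */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking whole-file OFD write lock on fd. */
int ofd_write_lock(int fd)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));     /* l_pid must be 0 for OFD commands */
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;                   /* 0 means "through end of file" */

	return fcntl(fd, F_OFD_SETLK, &fl);
}

/*
 * Hypothetical self-check: two open file descriptions of one file conflict,
 * and the lock is released once the owning description is closed.
 * Returns 0 if the kernel behaves that way, -1 otherwise.
 */
int ofd_lock_demo(void)
{
	char path[] = "/tmp/ofd-demo-XXXXXX";   /* assumed scratch location */
	int fd1, fd2, ret = -1;

	fd1 = mkstemp(path);
	if (fd1 < 0)
		return -1;
	fd2 = open(path, O_RDWR);
	if (fd2 < 0)
		goto out_close1;

	if (ofd_write_lock(fd1) != 0)
		goto out_close2;
	/* The second description must be refused while fd1 holds the lock. */
	if (ofd_write_lock(fd2) == 0 || (errno != EAGAIN && errno != EACCES))
		goto out_close2;

	close(fd1);                     /* drops the lock owned by fd1 */
	fd1 = -1;
	if (ofd_write_lock(fd2) == 0)
		ret = 0;

out_close2:
	close(fd2);
out_close1:
	if (fd1 >= 0)
		close(fd1);
	unlink(path);
	return ret;
}
```

Calling ofd_lock_demo() from a small main() should return 0 on a kernel
with OFD lock support (Linux 3.15 or later). It does not exercise the
multithreaded execve case itself; it only shows the locking call an
application would switch to.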