On Wed, Oct 09, 2024 at 08:41:31AM -0600, Tycho Andersen wrote: > On Wed, Oct 02, 2024 at 02:34:43PM +0000, Zbigniew Jędrzejewski-Szmek wrote: > > On Tue, Sep 24, 2024 at 12:39:35PM -0500, Eric W. Biederman wrote: > > > Tycho Andersen <tycho@tycho.pizza> writes: > > > > > > > From: Tycho Andersen <tandersen@xxxxxxxxxxx> > > > > > > > > Zbigniew mentioned at Linux Plumber's that systemd is interested in > > > > switching to execveat() for service execution, but can't, because the > > > > contents of /proc/pid/comm are the file descriptor which was used, > > > > instead of the path to the binary. This makes the output of tools like > > > > top and ps useless, especially in a world where most fds are opened > > > > CLOEXEC so the number is truly meaningless. > > > > > > > > This patch adds an AT_ flag to fix up /proc/pid/comm to instead be the > > > > contents of argv[0], instead of the fdno. > > > > I tried this version (with a local modification to drop the flag and > > enable the new codepath if get_user_arg_ptr(argv, 0) returns nonnull > > as suggested later in the thread), and it seems to work as expected. > > In particular, 'pgrep' finds for the original name in case of > > symlinks. > > Here is a version that only affects /proc/pid/comm, without a flag. We > still have to do the dance of keeping the user argv0 before actually > doing __set_task_comm(), since we want to surface the resulting fault > if people pass a bad argv0. Thoughts? > > Tycho > > > > diff --git a/fs/exec.c b/fs/exec.c > index dad402d55681..61de8a71f316 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1416,7 +1416,16 @@ int begin_new_exec(struct linux_binprm * bprm) > set_dumpable(current->mm, SUID_DUMP_USER); > > perf_event_exec(); > - __set_task_comm(me, kbasename(bprm->filename), true); > + > + /* > + * If fdpath was set, execveat() made up a path that will > + * probably not be useful to admins running ps or similar. > + * Let's fix it up to be something reasonable. > + */ > + if (bprm->argv0) > + __set_task_comm(me, kbasename(bprm->argv0), true); > + else > + __set_task_comm(me, kbasename(bprm->filename), true); This isn't checking fdpath? > > /* An exec changes our domain. We are no longer part of the thread > group */ > @@ -1566,9 +1575,30 @@ static void free_bprm(struct linux_binprm *bprm) > if (bprm->interp != bprm->filename) > kfree(bprm->interp); > kfree(bprm->fdpath); > + kfree(bprm->argv0); > kfree(bprm); > } > > +static int bprm_add_fixup_comm(struct linux_binprm *bprm, struct user_arg_ptr argv) > +{ > + const char __user *p = get_user_arg_ptr(argv, 0); > + > + /* > + * In keeping with the logic in do_execveat_common(), we say p == NULL > + * => "" for comm. > + */ > + if (!p) { > + bprm->argv0 = kstrdup("", GFP_KERNEL); > + return 0; > + } > + > + bprm->argv0 = strndup_user(p, MAX_ARG_STRLEN); > + if (bprm->argv0) > + return 0; > + > + return -EFAULT; > +} I'd rather this logic got done in copy_strings() and to avoid duplicating a copy for all exec users. I think it should be possible to just do this, to find the __user char *: diff --git a/fs/exec.c b/fs/exec.c index 77364806b48d..e12fd706f577 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -642,6 +642,8 @@ static int copy_strings(int argc, struct user_arg_ptr argv, goto out; } } + if (argc == 0) + bprm->argv0 = str; } ret = 0; out: Once we get to begin_new_exec(), only if we need to do the work (fdpath set), then we can do the strndup_user() instead of making every exec hold a copy regardless of whether it will be needed. -Kees > + > static struct linux_binprm *alloc_bprm(int fd, struct filename *filename, int flags) > { > struct linux_binprm *bprm; > @@ -1975,6 +2005,10 @@ static int do_execveat_common(int fd, struct filename *filename, > goto out_ret; > } > > + retval = bprm_add_fixup_comm(bprm, argv); > + if (retval != 0) > + goto out_free; > + > retval = count(argv, MAX_ARG_STRINGS); > if (retval == 0) > pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n", > diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h > index e6c00e860951..0cd1f2d0e8c6 100644 > --- a/include/linux/binfmts.h > +++ b/include/linux/binfmts.h > @@ -55,6 +55,7 @@ struct linux_binprm { > of the time same as filename, but could be > different for binfmt_{misc,script} */ > const char *fdpath; /* generated filename for execveat */ > + const char *argv0; /* argv0 from execveat */ > unsigned interp_flags; > int execfd; /* File descriptor of the executable */ > unsigned long loader, exec; -- Kees Cook