On Wed, Oct 05, 2022 at 08:06:15PM -0700, Kees Cook wrote: > Dave, this tracks back to commit a6f76f23d297 ("CRED: Make execve() take > advantage of copy-on-write credentials") ... any ideas what's happening > here? Er, rather, it originates before git history, but moved under lock in commit 0bf2f3aec547 ("CRED: Fix SUID exec regression"). Eric, Al, Hugh, does this ring a bell? It originates from 1da177e4c3f4 ("Linux-2.6.12-rc2") in git... static inline int unsafe_exec(struct task_struct *p) { int unsafe = 0; ... if (atomic_read(&p->fs->count) > 1 || atomic_read(&p->files->count) > 1 || atomic_read(&p->sighand->count) > 1) unsafe |= LSM_UNSAFE_SHARE; return unsafe; } Current code is: static void check_unsafe_exec(struct linux_binprm *bprm) { struct task_struct *p = current, *t; unsigned n_fs; ... t = p; n_fs = 1; spin_lock(&p->fs->lock); rcu_read_lock(); while_each_thread(p, t) { if (t->fs == p->fs) n_fs++; } rcu_read_unlock(); if (p->fs->users > n_fs) bprm->unsafe |= LSM_UNSAFE_SHARE; else p->fs->in_exec = 1; spin_unlock(&p->fs->lock); } Which seemed to take its form from: 0bf2f3aec547 ("CRED: Fix SUID exec regression") Quoting the rationale for the checks: ... moved the place in which the 'safeness' of a SUID/SGID exec was performed to before de_thread() was called. This means that LSM_UNSAFE_SHARE is now calculated incorrectly. This flag is set if any of the usage counts for fs_struct, files_struct and sighand_struct are greater than 1 at the time the determination is made. All of which are true for threads created by the pthread library. So, instead, we count up the number of threads (CLONE_THREAD) that are sharing our fs_struct (CLONE_FS), files_struct (CLONE_FILES) and sighand_structs (CLONE_SIGHAND/CLONE_THREAD) with us. These will be killed by de_thread() and so can be discounted by check_unsafe_exec(). So, I think this is verifying that when attempting a suid exec, there is no process out there with our fs_struct, file_struct, or sighand_struct that would survive the de_thread() and be able to muck with the suid's shared environment: if (atomic_read(&p->fs->count) > n_fs || atomic_read(&p->files->count) > n_files || atomic_read(&p->sighand->count) > n_sighand) Current code has eliminated the n_files and n_sighand tests: n_sighand was removed by commit f1191b50ec11 ("check_unsafe_exec() doesn't care about signal handlers sharing") n_files was removed by commit e426b64c412a ("fix setuid sometimes doesn't") The latter reads very much like the current bug report. :) So likely the n_fs test is buggy too... After de_thread(), I see the calls to unshare_sighand() and unshare_files(), so those check out. What's needed to make p->fs safe? Doing an unshare of it seems possible, since it exists half as a helper, unshare_fs(), and half open-coded in ksys_unshare (see "new_fw"). Should we wire this up after de_thread() like the other two? -Kees -- Kees Cook