Re: [syzbot] [fs?] [mm?] KCSAN: data-race in bprm_execve / copy_fs (4)

Oleg Nesterov <oleg@xxxxxxxxxx> · Sat, 22 Mar 2025 16:55:39 +0100

Quite possibly I am wrong, I need to recall this code, but at first
glance...

On 03/22, Al Viro wrote:
>
> Not really.

I agree, it is really racy. But,

> 1) A enters check_unsafe_execve(), sets ->in_exec to 1
> 2) B enters check_unsafe_execve(), sets ->in_exec to 1

No, check_unsafe_execve() is called with cred_guard_mutex held;
see prepare_bprm_creds().

> 3) A calls exec_binprm(), fails (bad binary)
> 4) A clears ->in_exec

So (2) can only happen after A fails and drops cred_guard_mutex.

And this means that we just need to ensure that ->in_exec is cleared
before this mutex is dropped, no? Something like below?
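
To make the locking argument concrete, here is a rough user-space model
of the invariant, not kernel code. Everything in it is a hypothetical
stand-in: a pthread mutex plays cred_guard_mutex, a plain int plays
->in_exec, and failing_exec()/run_model() are names invented for the
illustration. It assumes both tasks serialize on the same mutex, as
described above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for illustration; not the kernel structures. */
static pthread_mutex_t cred_guard = PTHREAD_MUTEX_INITIALIZER; /* models cred_guard_mutex */
static int in_exec;   /* models ->in_exec, shared by both threads */
static int seen_set;  /* how often a thread found in_exec already set on entry */

/*
 * Models repeated failing exec attempts: take the mutex, set the flag,
 * "fail", and clear the flag before the mutex is dropped.
 */
static void *failing_exec(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		pthread_mutex_lock(&cred_guard);
		if (in_exec)
			seen_set++;	/* would mean a stale flag leaked out */
		in_exec = 1;		/* analogue of check_unsafe_execve() */
		/* ... exec fails here (bad binary) ... */
		in_exec = 0;		/* cleared before unlocking */
		pthread_mutex_unlock(&cred_guard);
	}
	return NULL;
}

/* Runs two racing "execs" and returns how many stale flags were seen. */
int run_model(void)
{
	pthread_t a, b;
	int rc;

	seen_set = 0;
	rc = pthread_create(&a, NULL, failing_exec, NULL);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, failing_exec, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return seen_set;
}
```

Built with -pthread and called from a main(), run_model() should return
0: since the flag is only ever set and cleared inside the critical
section, no thread can enter and find it still set. Moving the
"in_exec = 0" after the unlock is what reopens the window.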

Oleg.
---

diff --git a/fs/exec.c b/fs/exec.c
index 506cd411f4ac..f8bf3c96e181 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1233,6 +1233,7 @@ int begin_new_exec(struct linux_binprm * bprm)
 	 * Make this the only thread in the thread group.
 	 */
 	retval = de_thread(me);
+	current->fs->in_exec = 0;
 	if (retval)
 		goto out;
 
@@ -1497,6 +1498,8 @@ static void free_bprm(struct linux_binprm *bprm)
 	}
 	free_arg_pages(bprm);
 	if (bprm->cred) {
+		// for the case exec fails before de_thread()
+		current->fs->in_exec = 0;
 		mutex_unlock(&current->signal->cred_guard_mutex);
 		abort_creds(bprm->cred);
 	}
@@ -1862,7 +1865,6 @@ static int bprm_execve(struct linux_binprm *bprm)
 
 	sched_mm_cid_after_execve(current);
 	/* execve succeeded */
-	current->fs->in_exec = 0;
 	current->in_execve = 0;
 	rseq_execve(current);
 	user_events_execve(current);
@@ -1881,7 +1883,6 @@ static int bprm_execve(struct linux_binprm *bprm)
 		force_fatal_sig(SIGSEGV);
 
 	sched_mm_cid_after_execve(current);
-	current->fs->in_exec = 0;
 	current->in_execve = 0;
 
 	return retval;