Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> writes: > On 09.03.2020 00:38, Eric W. Biederman wrote: >> >> The cred_guard_mutex is problematic. The cred_guard_mutex is held >> over the userspace accesses as the arguments from userspace are read. >> The cred_guard_mutex is held of PTRACE_EVENT_EXIT as the the other >> threads are killed. The cred_guard_mutex is held over >> "put_user(0, tsk->clear_child_tid)" in exit_mm(). >> >> Any of those can result in deadlock, as the cred_guard_mutex is held >> over a possible indefinite userspace waits for userspace. >> >> Add exec_update_mutex that is only held over exec updating process >> with the new contents of exec, so that code that needs not to be >> confused by exec changing the mm and the cred in ways that can not >> happen during ordinary execution of a process. >> >> The plan is to switch the users of cred_guard_mutex to >> exec_udpate_mutex one by one. This lets us move forward while still >> being careful and not introducing any regressions. >> >> Link: https://lore.kernel.org/lkml/20160921152946.GA24210@xxxxxxxxxxxxxx/ >> Link: https://lore.kernel.org/lkml/AM6PR03MB5170B06F3A2B75EFB98D071AE4E60@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ >> Link: https://lore.kernel.org/linux-fsdevel/20161102181806.GB1112@xxxxxxxxxx/ >> Link: https://lore.kernel.org/lkml/20160923095031.GA14923@xxxxxxxxxx/ >> Link: https://lore.kernel.org/lkml/20170213141452.GA30203@xxxxxxxxxx/ >> Ref: 45c1a159b85b ("Add PTRACE_O_TRACEVFORKDONE and PTRACE_O_TRACEEXIT facilities.") >> Ref: 456f17cd1a28 ("[PATCH] user-vm-unlock-2.5.31-A2") >> Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> >> --- >> fs/exec.c | 9 +++++++++ >> include/linux/sched/signal.h | 9 ++++++++- >> init/init_task.c | 1 + >> kernel/fork.c | 1 + >> 4 files changed, 19 insertions(+), 1 deletion(-) >> >> diff --git a/fs/exec.c b/fs/exec.c >> index d820a7272a76..ffeebb1f167b 100644 >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -1014,6 +1014,7 @@ static int exec_mmap(struct mm_struct *mm) >> { >> struct task_struct *tsk; >> struct mm_struct *old_mm, *active_mm; >> + int ret; >> >> /* Notify parent that we're no longer interested in the old VM */ >> tsk = current; >> @@ -1034,6 +1035,11 @@ static int exec_mmap(struct mm_struct *mm) >> return -EINTR; >> } >> } >> + >> + ret = mutex_lock_killable(&tsk->signal->exec_update_mutex); >> + if (ret) >> + return ret; > > You missed old_mm->mmap_sem unlock. See here: Duh. Thank you. I actually need to switch the lock ordering here, and I haven't yet because my son was sick yesterday. Something like this. diff --git a/fs/exec.c b/fs/exec.c index 96f89401b4d1..03d50c27ec01 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1020,9 +1020,14 @@ static int exec_mmap(struct mm_struct *mm) tsk = current; old_mm = current->mm; exec_mm_release(tsk, old_mm); + if (old_mm) + sync_mm_rss(old_mm); + + ret = mutex_lock_killable(&tsk->signal->exec_update_mutex); + if (ret) + return ret; if (old_mm) { - sync_mm_rss(old_mm); /* * Make sure that if there is a core dump in progress * for the old mm, we get out and die instead of going @@ -1032,14 +1037,11 @@ static int exec_mmap(struct mm_struct *mm) down_read(&old_mm->mmap_sem); if (unlikely(old_mm->core_state)) { up_read(&old_mm->mmap_sem); + mutex_unlock(&tsk->signal->exec_update_mutex); return -EINTR; } } - ret = mutex_lock_killable(&tsk->signal->exec_update_mutex); - if (ret) - return ret; - task_lock(tsk); active_mm = tsk->active_mm; membarrier_exec_mmap(mm); > diff --git a/fs/exec.c b/fs/exec.c > index 47582cd97f86..d557bac3e862 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1063,8 +1063,11 @@ static int exec_mmap(struct mm_struct *mm) > } > > ret = mutex_lock_killable(&tsk->signal->exec_update_mutex); > - if (ret) > + if (ret) { > + if (old_mm) > + up_read(&old_mm->mmap_sem); > return ret; > + } > > task_lock(tsk); > active_mm = tsk->active_mm; Eric