Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> writes: > On 3/3/20 9:08 PM, Eric W. Biederman wrote: >> Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> writes: >> >>> On 3/3/20 4:18 PM, Eric W. Biederman wrote: >>>> Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> writes: >>>>> diff --git a/tools/testing/selftests/ptrace/vmaccess.c b/tools/testing/selftests/ptrace/vmaccess.c >>>>> new file mode 100644 >>>>> index 0000000..6d8a048 >>>>> --- /dev/null >>>>> +++ b/tools/testing/selftests/ptrace/vmaccess.c >>>>> @@ -0,0 +1,66 @@ >>>>> +// SPDX-License-Identifier: GPL-2.0+ >>>>> +/* >>>>> + * Copyright (c) 2020 Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> >>>>> + * All rights reserved. >>>>> + * >>>>> + * Check whether /proc/$pid/mem can be accessed without causing deadlocks >>>>> + * when de_thread is blocked with ->cred_guard_mutex held. >>>>> + */ >>>>> + >>>>> +#include "../kselftest_harness.h" >>>>> +#include <stdio.h> >>>>> +#include <fcntl.h> >>>>> +#include <pthread.h> >>>>> +#include <signal.h> >>>>> +#include <unistd.h> >>>>> +#include <sys/ptrace.h> >>>>> + >>>>> +static void *thread(void *arg) >>>>> +{ >>>>> + ptrace(PTRACE_TRACEME, 0, 0L, 0L); >>>>> + return NULL; >>>>> +} >>>>> + >>>>> +TEST(vmaccess) >>>>> +{ >>>>> + int f, pid = fork(); >>>>> + char mm[64]; >>>>> + >>>>> + if (!pid) { >>>>> + pthread_t pt; >>>>> + >>>>> + pthread_create(&pt, NULL, thread, NULL); >>>>> + pthread_join(pt, NULL); >>>>> + execlp("true", "true", NULL); >>>>> + } >>>>> + >>>>> + sleep(1); >>>>> + sprintf(mm, "/proc/%d/mem", pid); >>>>> + f = open(mm, O_RDONLY); >>>>> + ASSERT_LE(0, f); >>>>> + close(f); >>>>> + f = kill(pid, SIGCONT); >>>>> + ASSERT_EQ(0, f); >>>>> +} >>>>> + >>>>> +TEST(attach) >>>>> +{ >>>>> + int f, pid = fork(); >>>>> + >>>>> + if (!pid) { >>>>> + pthread_t pt; >>>>> + >>>>> + pthread_create(&pt, NULL, thread, NULL); >>>>> + pthread_join(pt, NULL); >>>>> + execlp("true", "true", NULL); >>>>> + } >>>>> + >>>>> + sleep(1); >>>>> + f = ptrace(PTRACE_ATTACH, pid, 0L, 0L); >>>> >>>> To be meaningful this code needs to learn to loop when >>>> ptrace returns -EAGAIN. >>>> >>>> Because that is pretty much what any self respecting user space >>>> process will do. >>>> >>>> At which point I am not certain we can say that the behavior has >>>> sufficiently improved not to be a deadlock. >>>> >>> >>> In this special dead-duck test it won't work, but it would >>> still be lots more transparent what is going on, since previously >>> you had two zombie process, and no way to even output debug >>> messages, which also all self respecting user space processes >>> should do. >> >> Agreed it is more transparent. So if you are going to deadlock >> it is better. >> >> My previous proposal (which I admit is more work to implement) would >> actually allow succeeding in this case and so it would not be subject to >> a dead lock (even via -EGAIN) at this point. >> >>> So yes, I can at least give a good example and re-try it several >>> times together with wait4 which a tracer is expected to do. >> >> Thank you, >> >> Eric >> > > Okay, I think it can be done with minimal API changes, > but it needs two mutexes, one that guards the execve, > and one that guards only the credentials. > > If no traced sibling thread exists, the mutexes are used this way: > lock(exec_guard_mutex) > cred_locked_in_execve = true; > de_thread() > lock(cred_guard_mutex) > unlock(cred_guard_mutex) > cred_locked_in_execve = false; > unlock(exec_guard_mutex) > > so effectively no API change at all. > > If a traced sibling thread exists, the mutexes are used differently: > lock(exec_guard_mutex) > cred_locked_in_execve = true; > unlock(exec_guard_mutex) > de_thread() > lock(cred_guard_mutex) > unlock(cred_guard_mutex) > lock(exec_guard_mutex) > cred_locked_in_execve = false; > unlock(exec_guard_mutex) > > Only the case changes that would deadlock anyway. Let me propose a slight alternative that I think sets us up for long term success. Leave cred_guard_mutex as is, but declare it undesirable. The cred_guard_mutex as designed really is something we should get rid of. As it it can sleep over several different userspace accesses. The copying of the exec arguments is technically as prone to deadlock as the ptrace case. Add a new mutex with a better name perhaps "exec_change_mutex" that is used to guard the changes that exec makes to a process. Then we gradually shift all the cred_guard_mutex users over to the new mutex. AKA one patch per user of cred_guard_mutex. At each patch that shifts things over we will have the opportunity to review the code to see that there no funny dependencies that were missed. I will sign up for working on the no_new_privs and ptrace_attach cases as I think I can make those happen. Especially no_new_privs. Getting the easier cases will resolve your issues and put things on a better footing. Eric