Re: [PATCH v2 00/15] Make the user mode driver code a better citizen

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Thu, 2 Jul 2020 22:40:09 +0900

On 2020/07/02 22:08, Eric W. Biederman wrote:
> Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> writes:
> 
>> On 2020/06/30 21:29, Eric W. Biederman wrote:
>>> Hmm.  The wake up happens just of tgid->wait_pidfd happens just before
>>> release_task is called so there is a race.  As it is possible to wake
>>> up and then go back to sleep before pid_has_task becomes false.
>>
>> What is the reason we want to wait until pid_has_task() becomes false?
>>
>> - wait_event(tgid->wait_pidfd, !pid_has_task(tgid, PIDTYPE_TGID));
>> + while (!wait_event_timeout(tgid->wait_pidfd, !pid_has_task(tgid, PIDTYPE_TGID), 1));
> 
> So that it is safe to call bpfilter_umh_cleanup.  The previous code
> performed the wait by having a callback in do_exit.

But bpfilter_umh_cleanup() only does

	fput(info->pipe_to_umh);
	fput(info->pipe_from_umh);
	put_pid(info->tgid);
	info->tgid = NULL;

which is (I think) already safe regardless of the usermode process because
bpfilter_umh_cleanup() merely closes one side of two pipes used between
two processes and forgets about the usermode process.
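
As a rough userspace analogy (a sketch only; this is not kernel code and
not part of the patch series), the same shape can be shown with pipe(),
fork() and close(). The names struct helper_info, helper_start() and
helper_cleanup() are hypothetical stand-ins invented for illustration:
close() plays the role of fput(), and clearing the pid plays the role of
put_pid(). The cleanup never inspects the state of the helper.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for the info the kernel side keeps. */
struct helper_info {
	int pipe_to_helper;	/* write end: parent -> helper */
	int pipe_from_helper;	/* read end: helper -> parent */
	pid_t pid;		/* 0 when no helper is tracked */
};

/* Start a helper that reads from its pipe until it sees EOF, then exits. */
static int helper_start(struct helper_info *info)
{
	int to[2], from[2];
	pid_t pid;

	if (pipe(to) < 0)
		return -1;
	if (pipe(from) < 0) {
		close(to[0]);
		close(to[1]);
		return -1;
	}
	pid = fork();
	if (pid < 0) {
		close(to[0]);
		close(to[1]);
		close(from[0]);
		close(from[1]);
		return -1;
	}
	if (pid == 0) {
		char c;

		/* Helper: keep only its own ends of the two pipes. */
		close(to[1]);
		close(from[0]);
		while (read(to[0], &c, 1) > 0)
			;
		_exit(0);
	}
	/* Parent: keep only its own ends of the two pipes. */
	close(to[0]);
	close(from[1]);
	info->pipe_to_helper = to[1];
	info->pipe_from_helper = from[0];
	info->pid = pid;
	return 0;
}

/*
 * Analog of the cleanup: drop our side of both pipes and forget the
 * helper. This does not wait for the helper and does not care whether
 * it is still running.
 */
static void helper_cleanup(struct helper_info *info)
{
	close(info->pipe_to_helper);
	close(info->pipe_from_helper);
	info->pid = 0;
}
```

In this sketch the helper simply sees EOF on its pipe once the parent's
write end is gone. A real userspace program would still have to reap the
child at some point, which has no direct counterpart in the kernel case.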

> 
> It might be possible to call bpf_umh_cleanup early but I have not done
> that analysis.
> 
> To perform the test correctly what I have right now is:

Waiting for the termination of a SIGKILLed usermode process is not
that simple. If a usermode process was killed by the OOM killer, it
might take minutes for the killed process to reach do_exit() due to
an invisible memory allocation dependency chain. Since the OOM killer
kicks the OOM reaper, and the OOM reaper forgets about the killed
process after one second if mmap_sem could not be held (in order to
avoid OOM deadlock), the OOM situation will eventually be resolved; but
there is no guarantee that the killed process can reach do_exit()
in a short period.