Re: [PATCH 11/11] seccomp: Add tgid and tid into seccomp_data

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Tue, 29 Jul 2014 21:35:00 -0700

On Tue, Jul 29, 2014 at 9:08 PM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>
>> On Mon, Jul 28, 2014 at 2:18 PM, Eric W. Biederman
>> <ebiederm@xxxxxxxxxxxx> wrote:
>>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>>>
>>>> [cc: Eric Biederman]
>>>>
>>>
>>>> Can we do one better and add a flag to prevent any non-self pid
>>>> lookups?  This might actually be easy on top of the pid namespace work
>>>> (e.g. we could change the way that find_task_by_vpid works).
>>>>
>>>> It's far from just being signals.  There's access_process_vm, ptrace,
>>>> all the signal functions, clock_gettime (see CPUCLOCK_PID -- yes, this
>>>> is ridiculous), and probably some others that I've forgotten about or
>>>> never noticed in the first place.
>>>
>>> So here is the practical question.
>>>
>>> Are these processes that only can send signals to their thread group
>>> allowed to call fork()?
>>>
>>>
>>> If fork is allowed and all pid lookups are restricted to their own
>>> thread group, then wait, waitpid, and all of the rest of the wait family
>>> will never return the pids of their children, and zombies will
>>> accumulate.  Aka the semantics are fundamentally broken.
>>
>> Good point.
>>
>> I can imagine at least three ways that fork() could continue working, though:
>>
>> 1. Allow lookups of immediate children, too.  (I don't love this one.)
>> 2. Allow non-self pids to be translated in but not out.  This way
>> P_ALL will continue working.
>> 3. Have the kernel treat any PID-restricted process as though it were NOCLDWAIT.
>>
>> I think I like #3.  Thoughts?
>>
>>>
>>> If fork is not allowed pid namespaces already solve this problem.
>>
>> PID namespaces are fairly heavyweight.  Julien pointed out that using
>> PID namespaces requires a bunch of dummy PID 1 processes.
>
> Only if you can't tolerate init exiting.  The reasoning with respect to
> signals and signals being ignored was wrong.  And if you only have one
> process you care about and no children to worry about neither the
> difference in signal handling nor the world dies when init exits applies.

Can you elaborate?  It seems entirely plausible to me that there are
programs that won't work right as PID 1 without considerable
adaptation.

>
> Therefore, given what I have read described, pid namespaces are a trivial
> solution to this problem space.

PID namespaces also won't work in the context of Capsicum unless you
want every single Capsicum process to be in its own pid namespace.  Also,
pid namespaces don't offer any way to protect children from parents.

--Andy