Re: [PATCH] setpgid.2, exit.3: document the lack of POSIX-specified behaviour inside PID NS

"Dmitry V. Levin" <ldv@xxxxxxxxxxxx> writes:

> On Sun, Mar 10, 2019 at 11:59:26AM -0500, Eric W. Biederman wrote:
>> "Dmitry V. Levin" <ldv@xxxxxxxxxxxx> writes:
>> 
>> > On Thu, Mar 07, 2019 at 09:02:07PM +0100, Eugene Syromyatnikov wrote:
>> >> On Thu, Mar 7, 2019 at 8:05 PM Eugene Syromyatnikov <evgsyr@xxxxxxxxx> wrote:
>> >> >
>> >> > The POSIX-mandated behaviour of sending SIGCONT/SIGHUP to stopped processes
>> >> > of an orphaned process group is not observed inside PID namespaces, as
>> >> > can be verified by running [1] inside a PID namespace, for example.
>> >> >
>> >> > The derivation is (presumably) introduced by Linux commit
>> >> s/derivation/deviation/
>> >> 
>> >> > v2.6.24-rc1~237 ("pid namespaces: define is_global_init() and
>> >> > is_container_init()").
>> >> >
>> >> > [1] https://gitlab.com/strace/strace/commit/4278e6613f48273e7da0989712f1c18aaffefd84
>> >> >
>> >> > Reported-by: Dmitry V. Levin <ldv@xxxxxxxxxxxx>
>> >> > Signed-off-by: Eugene Syromyatnikov <evgsyr@xxxxxxxxx>
>> >> 
>> >> It should probably also be noted that the behaviour is also described
>> >> in TLPI, Section 34.8 ("Process groups, sessions, and job control:
>> >> Summary"), so it also likely has to be updated.
>> >
>> > Strictly speaking, whether orphaned process group semantics works in
>> > a PID namespace or not depends on the session ID.  If the session ID is
>> > the same as the session ID of init (which happens quite often in case
>> > of a PID namespace), then orphaned process group semantics doesn't work.
>> > If they differ, then the POSIX-mandated behaviour is supported.
>> 
>> 
>> http://pubs.opengroup.org/onlinepubs/9699919799 says:
>> > 3.264:  Orphaned Process Group
>> > 
>> > A process group in which the parent of every member is either itself
>> > a member of the process group or is not a member of the group's session.
>> 
>> It does not say anything about init.
>
> No, it doesn't say anything about init.
>
> I'm saying that the current linux behaviour is not conforming because
> the POSIX-mandated orphaned process group semantics is not implemented
> for the case when the session ID is the same as the session ID of init
> in the PID namespace.

The description below sounds like a real problem that breaks existing
software.

I don't see how I can say that code working exactly as specified by
POSIX is non-conforming.  I will definitely say that there is an issue.

>> Which makes the current version of orphaned process group handling posix
>> conformant.  By not ignoring the pid namespace init the code may not be
>> backwards compatible with the rest of linux.    Which may be a problem
>> worth addressing, either in the documentation or in the code.
>> 
>> It is not a break from posix.
>> 
>> Where is this behavior a problem?
>
> It is a problem in GitLab CI and whoever else uses docker-like
> containerization in a simple way.
>
> One of our complex strace tests passed everywhere except GitLab CI where
> it failed.  We were at a loss to find out why until we suspected the
> kernel and wrote a simple test for the orphaned process group semantics.
> When that simple test passed everywhere except GitLab CI where it failed,
> we suspected PID namespaces and reproduced the failure using "unshare
> -fp".

Thank you.  That description helps a lot.

At a minimum it sounds like we should document this case as a potential
problem and fix docker to not do that.

I am open to changing the behavior in the kernel, but I want to make
certain any change is well justified before I make it, so that, if
possible, no other regressions or complications are introduced.

The intended semantics are that sessions and process groups can span
pid namespaces.  So I need to wrap my head around what is special about
a pid namespace that causes problems.  Is it the reparenting to the pid
namespace init?  Or do we just have a case where the session is set up
in a funny way, so that the process group looks orphaned from inside the
process group but does not actually act orphaned?

Hmm.  It looks like you have answered my questions with your test
program orphaned_process_group.  Is the source available anywhere handy
so that I can read it?
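
For anyone following along without the strace tree, here is a rough
sketch of the kind of check being discussed.  It is not the strace test
itself: the helper names (orphaned_pgrp_gets_sighup, _runner, _leader),
the pipe used for reporting and the 2 second timeout are all made up for
illustration.  A leader creates a new process group, its child stops
itself, and the leader then exits, which orphans the group.  The sketch
reports whether the stopped member was sent SIGHUP, either in the
caller's session or in a fresh one (the "setsid" case quoted below).

```python
import os
import select
import signal


def orphaned_pgrp_gets_sighup(new_session: bool, timeout: float = 2.0) -> bool:
    """Hypothetical probe: True if a stopped member of a newly orphaned
    process group is sent SIGHUP."""
    runner = os.fork()
    if runner == 0:
        code = 1
        try:
            code = _runner(new_session, timeout)
        finally:
            os._exit(code)  # never fall back into the caller's code
    _, status = os.waitpid(runner, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


def _runner(new_session: bool, timeout: float) -> int:
    if new_session:
        os.setsid()  # roughly what "setsid ./orphaned_process_group" does
    rfd, wfd = os.pipe()
    leader = os.fork()
    if leader == 0:
        _leader(rfd, wfd, timeout)  # does not return
    os.close(wfd)
    os.waitpid(leader, 0)  # leader is gone: its group has just been orphaned
    readable, _, _ = select.select([rfd], [], [], timeout)
    got_sighup = bool(readable) and os.read(rfd, 1) == b"H"
    try:
        os.killpg(leader, signal.SIGKILL)  # clean up a member left stopped
    except ProcessLookupError:
        pass
    return 0 if got_sighup else 1


def _leader(rfd: int, wfd: int, timeout: float) -> None:
    try:
        os.close(rfd)
        os.setpgid(0, 0)  # become leader of a new process group
        member = os.fork()
        if member == 0:
            # Keep SIGHUP pending instead of letting it terminate us.
            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGHUP})
            os.kill(os.getpid(), signal.SIGSTOP)  # stop until SIGCONT
            if signal.sigtimedwait({signal.SIGHUP}, timeout) is not None:
                os.write(wfd, b"H")  # report that SIGHUP arrived
            os._exit(0)
        os.waitpid(member, os.WUNTRACED)  # wait until the member has stopped
    finally:
        os._exit(0)  # exiting here is what orphans the group


if __name__ == "__main__":
    for new_session in (False, True):
        ok = orphaned_pgrp_gets_sighup(new_session)
        state = "delivered" if ok else "not delivered"
        print(f"new_session={new_session}: SIGHUP {state}")
```

Run under "unshare -fprU", I would expect the new_session=False case to
be the one that differs, going by the example quoted below, but I have
not verified that this sketch behaves the same way as the real test.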

>> > For example:
>> >
>> > $ unshare -fprU sh -c './orphaned_process_group >/dev/null' && echo good || echo bad
>> > Orphaned process group semantics is not supported by the kernel
>> > bad
>> > $ unshare -fprU sh -c 'setsid ./orphaned_process_group >/dev/null' && echo good || echo bad
>> > good
>> >
>> > What can I say?  The very least that could be done to fix this is
>> > to replace is_global_init() invocation with is_container_init()
>> > in will_become_orphaned_pgrp() as suggested in
>> > https://lkml.org/lkml/2007/12/8/208

Thank you very much,
Eric Biederman



