On Wed, 2019-02-20 at 14:26 +0100, Christian Brauner wrote:
> On Wed, Feb 20, 2019 at 10:46:24AM +0800, Ian Kent wrote:
> > On Fri, 2019-02-15 at 16:07 +0000, David Howells wrote:
> > > Implement a kernel container object such that it contains the following
> > > things:
> > >
> > >  (1) Namespaces.
> > >
> > >  (2) A root directory.
> > >
> > >  (3) A set of processes, including one designated as the 'init' process.
> >
> > Yeah, I think a name other than init needs to be used for this
> > process.
> >
> > The problem being that there is no requirement for container
> > process 1 to behave in any way like an "init" process is
> > expected to behave and that leads to confusion (at least
> > it certainly did for me).
>
> If you look at the documentation for pid namespaces(7) you can see that
> the pid 1 inside a pid namespace is expected to behave like an init
> process:
> - "The first process created in a new namespace [...] has the PID 1,
>   and is the "init" process for the namespace (see init(1))."
> - "[...] child process that is orphaned within the namespace will be
>   reparented to this process rather than init(1) [...]"
> - "If the "init" process of a PID namespace terminates, the kernel
>   terminates all of the processes in the namespace via a SIGKILL
>   signal. This behavior reflects the fact that the "init" process is
>   essential for the correct operation of a PID namespace."
> - "Only signals for which the "init" process has established a signal
>   handler can be sent to the "init" process by other members of the
>   PID namespace."
> - "[...] the reboot(2) system call causes a signal to be sent to the
>   namespace "init" process."
>
> This is one of the reasons why all major current container runtimes
> finally after years of failing to realize this run a stub init process
> that mimicks a dumb init. Sure, you get away with not having an init
> that behaves like an init but this is inherently broken or at least
> against the way pid namespaces were designed.

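For reference, the signal rules that pid_namespaces(7) gives for a
namespace "init" can be written down as a small model. This is only a
sketch in Python: init_receives() and its parameters are made-up names,
it covers signals sent from userspace with kill(2) only, and it says
nothing about signals raised from inside the kernel. The SIGKILL/SIGSTOP
exception for senders in an ancestor namespace comes from the same man
page, not from the excerpts quoted above.

```python
import signal

# Per pid_namespaces(7): these two are forcibly delivered to a namespace
# "init" when the sender is in an ancestor PID namespace.
FORCED_FROM_ANCESTOR = {signal.SIGKILL, signal.SIGSTOP}


def init_receives(sig, handled, sender_in_ancestor_ns):
    """Model (not a kernel interface): would the namespace "init" see `sig`?

    handled: set of signals for which "init" has installed a handler.
    sender_in_ancestor_ns: True if the sender lives in an ancestor PID
    namespace, False if it is a member of init's own namespace.
    """
    if sender_in_ancestor_ns and sig in FORCED_FROM_ANCESTOR:
        return True
    # Everything else only gets through if "init" asked for it.
    return sig in handled


if __name__ == "__main__":
    handled = {signal.SIGTERM}  # hypothetical init that handles SIGTERM only
    for sig in (signal.SIGTERM, signal.SIGUSR1, signal.SIGKILL):
        for from_ancestor in (False, True):
            print(sig.name, "ancestor" if from_ancestor else "same-ns",
                  init_receives(sig, handled, from_ancestor))
```

Under this model an "init" with no handlers installed sees nothing from
userspace except SIGKILL and SIGSTOP sent from an ancestor namespace.
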
TBH I wasn't sure why the signal I sent didn't arrive. AFAICS it
should have, regardless of what signals the container init process
was accepting. But it could have been due to a different problem in
my kernel code (that's very likely).

In any case it wasn't worth pursuing because, even if I did work it
out, I had already found that the request_key sub-system wasn't
playing well with others when trying to run something within a
container's namespaces, so there was no point in going further ...

Ian