Re: [PATCH 1/2] Implementation deficiency in virInitctlSetRunLevel v4

Eric Blake <eblake@xxxxxxxxxx> · Sat, 21 Dec 2013 00:14:21 -0700

On 12/20/2013 11:17 AM, Eric Blake wrote:
> On 12/20/2013 09:34 AM, Daniel P. Berrange wrote:
>> On Fri, Dec 20, 2013 at 08:24:41PM +0400, Reco wrote:
>>> Implement virProcessRunInMountNamespace, which runs callback of type
>>> virProcessNamespaceCallback in a container namespace.
>>>
>>> Hope it'll nail it this time.
> 
> Fails 'make syntax-check':
> 
> prohibit_fork_wrappers
> src/util/virprocess.c:874:    switch (cpid = fork()) {
> maint.mk: use virCommand for child processes
> make: *** [sc_prohibit_fork_wrappers] Error 1
> 
>>> +
>>> +    switch (cpid = fork()) {

Yuck.  Rewriting this to use virFork() has exposed some nastiness in our
existing library: 'virsh lxc-enter-namespace' and 'virt-login-shell'
both use virFork() incorrectly, which is a sign that the interface
itself is to blame.  The fact that a child process must explicitly call
_exit() if virFork() returns < 0 but set *pid to 0 is just nasty - it
violates the normal expectation that a negative return has no more work
to do.  I still plan on posting a revised version of this patch, but it
will be part of a bit bigger series that first tries to make virFork and
virProcessWait friendlier.

I spent some time learning about namespaces in the process - I can see
why 'virsh lxc-enter-namespace' must fork twice (although setns() can
join some namespaces on a per-thread basis, it cannot do so for the
'user namespace', so one fork is required to get to a single-threaded
state since virsh is multithreaded; the second fork is required because
setting the 'pid namespace' doesn't actually change the pid of the
current process, it only designates that child processes will be in the
new namespace).  But why is 'virt-login-shell' double-forking?  It is
already single-threaded, and doesn't do anything after fork except wait
on the child process, so a single fork after setting the 'pid namespace'
ought to be sufficient.

>>> +    case 0:
>>> +        if (setns(fd, 0) == -1) {
>>> +            _exit(-1);
> 
> exit(-1) is wrong; the user will see $? of 255 (not -1), and it is not
> our typical exit status for failure (where we normally want to be < 128
> so that it is not confused with exit due to signal).  Besides, I hate
> magic numbers; better would be _exit(EXIT_FAILURE).

Or maybe _exit(127), which is more idiomatic for failures when fork()
succeeds but exec() is not reached (see posix_spawn() for example).

> 
>>> +    default:
>>> +        if (virProcessWait(cpid, &status) < 0 || status < 0) {
> 
> No need to pass &status to virProcessWait() if you are going to insist
> on 0 status anyways; better to pass NULL and let virProcessWait do the
> validation on your behalf.

On the other hand, we may NEED to pass status on to the user, as I
noticed in patch 2 that your callback function wants to distinguish
between fatal errors and the simpler case of no initctl support;
collapsing all non-zero status into failure loses our chance to report
multiple bits of information.

> I'll repost a full v5 series with my proposed fixes to these issues in a
> separate thread, and wait for a review (since this fixes a CVE, it needs
> a second set of eyes).

That, and we still need the fix for virDomainDeviceAttach using /dev
only within the guest namespace.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment:
signature.asc

Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list