On Tue, Sep 11, 2007 at 12:00:33PM +0100, Richard W.M. Jones wrote: > Daniel Veillard wrote: > >On Tue, Sep 11, 2007 at 11:35:46AM +0200, Gerd Hoffmann wrote: > >>Daniel Veillard wrote: > >>> - the ssh process dies > >>> - libvirt based application takes some time to notice it > >>> - the OS span a new process with the same PID after a PID rollabck > >>Can not happen as long as libvirt hasn't asked for the exist status via > >>waitpid() because the pid is still in use by the zombie ssh process. > > > > Hum, which is precisely why we need the patch. Still I would feel a bit > >better if we could check that priv->pid is a child of the current process > >something like (getppid(priv->pid) == getpid()) test before any kill would > >do this easilly I think. > > I think Gerd's point was that as long as we haven't waited for the PID > within this process before, the PID cannot be reused. AFAIK there is no API to give you the parent PID of an arbitrary PID. The getppid() call returns your own parent - you can't ask it for someone else's parent. > That doesn't mean the situation cannot arise -- for example the main > program might be using other libraries as well as libvirt, and those > other libraries might blindly wait(2) for children. There is an issue if the app has set SIGCHLD to SIG_IGN - the kernel will automatically reap zombies then. This would allow the race that Daniel illustrates above, where we might 'kill' a program that is no longer our own SSH client. In the case of cleaning up after a failed doRemoteOpen call we should be safe enough, since we only spawned the SSH process 10 lines higher up and the system was have to be insanely busy to cycle through 65536 PIDs before we completed those 10 lines. In the case of doRemoteClose we've not got alot of good options. Either take the risk that SIGCHILD is SIG_IGN or someone else called wait() and do the kill() anyway. Another option is to double-fork() when running the SSH tunnel so it gets inherited by init. Then we assume that SSH will die & exit when we close the socket in doRemoteClose. ie closing our end of the socket should make SSH get a SIGPIPE / EOF and exit - or equivalently if the server closes its end. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=| -- Libvir-list mailing list Libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list