On Mon, Apr 29, 2019 at 5:39 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
>
> ... uuuh, whoops. Turns out I don't know what I'm talking about.

Well, apparently there's some odd libc issue according to Florian, so
there *might* be something to it.

> Nevermind. For some reason I thought vfork() was just
> CLONE_VFORK|SIGCHLD, but now I see I got that completely wrong.

Well, inside the kernel, that's actually *very* close to what vfork() is:

    SYSCALL_DEFINE0(vfork)
    {
            return _do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, 0,
                            0, NULL, NULL, 0);
    }

but that's just an internal implementation detail. It's a real vfork()
and should act as the traditional BSD "share everything" without any
address space copying. The CLONE_VFORK flag is what does the "wait for
child to exit or execve" magic.

Note that vfork() is "exciting" for the compiler in much the same way
"setjmp/longjmp()" is, because of the shared stack use in the child
and the parent. It is *very* easy to get this wrong and cause massive
and subtle memory corruption issues because the parent returns to
something that has been messed up by the child.

That may be why some libc might end up just using "fork()", because it
ends up avoiding bugs in user space.

(In fact, if I recall correctly, the _reason_ we have an explicit
'vfork()' entry point rather than using clone() with magic parameters
was that the lack of arguments meant that you didn't have to
save/restore any registers in user space, which made the whole stack
issue simpler. But it's been two decades, so my memory is bitrotting).

Also, particularly if you have a big address space, vfork()+execve()
can be quite a bit faster than fork()+execve(). Linux fork() is pretty
efficient, but if you have gigabytes of VM space to copy, it's going
to take time even if you do it fairly well.

               Linus
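
For reference, the vfork()+execve() pattern described above might look
roughly like this in user space. This is only a minimal sketch, not
anything taken from a real libc: spawn_and_wait() is a hypothetical
helper name, and the code assumes a POSIX-ish system where vfork() is
declared in <unistd.h>. The important part is that the child does
nothing except execve() or _exit(), since it is running on the
parent's stack until one of those happens.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* assumption: glibc; exposes the vfork() prototype */
#endif
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: run the program at `path` with `argv` and return
 * its exit status, or -1 if it could not be started or did not exit
 * normally.
 */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid = vfork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /*
         * Child: shares the parent's address space and stack, and the
         * parent is suspended until we execve() or _exit(). Do not
         * return from this function, modify parent-visible variables,
         * or call exit() (which would run the parent's atexit handlers
         * and flush its stdio buffers).
         */
        execve(path, argv, environ);
        _exit(127);   /* only reached if execve() failed */
    }

    /* Parent: resumes only after the child has exec'ed or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass something like "/bin/sh" with an argv of
{"sh", "-c", "exit 7", NULL} and get 7 back. Where it is available,
posix_spawn() is usually the safer way to get the same effect without
having to reason about the shared stack by hand.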