On Mon, Apr 1, 2019 at 3:13 PM Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Apr 1, 2019 at 2:58 PM Jonathan Kowalski <bl0pbl33p@xxxxxxxxx> wrote:
> >
> > You mention the race about learning the PID, PID being recycled, and
> > pidfd_open getting the wrong reference.
> >
> > This exists with the /proc model to way. How do you propose to address this?
>
> Note that that race exists _regardless_ of any interfaces.
> pidfd_open() has the same race: any time you have a pid, the lifetime
> of it is only as long as the process existing.
>
> That's why we talked about the CLONE_PIDFD flag, which would return
> the pidfd itself when creating a new process. That's one truly
> race-free way to handle it.

Yes. Returning a pidfd from clone seems like a simple and robust approach.

> Or just do the fork(), and know that the pid won't be re-used until
> you've done the wait() for it, and block SIGCHLD until you've done the
> lookup.

That doesn't work when some other thread is running a waitpid(-1)
loop. I think it's important to create an interface that libraries can
use without global coordination.

> That said, in *practice*, you can probably use any of the racy "look
> up pidfd using pid" models, as long as you just verify the end result
> after you've opened it.
>
> That verification could be as simple as "look up the parent pid of the
> pidfd I got", if you know you created it with fork() (and you can
> obviously track what _other_ thread you yourself created, so you can
> verify whether it is yours or not).
>
> For example, using "openat(pidfd, "status", ..)", but also by just
> tracking what you've done waitpid() on (but you need to look out for
> subtle races with another thread being in the process of doing so).
>
> Or you can just say that as long as you got the pidfd quickly after
> the fork(), any pid wrapping attack is practically not possible even
> if it might be racy in theory.

I don't like ignoring races just because they're rare.
The cost of complete race freedom for the process interface is low considering the work we're doing on pidfds anyway.
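
For concreteness, the "fork, look up, then verify the parent pid" pattern quoted above might look roughly like the sketch below. It is an illustration only, not code from this thread. The helper names parent_pid_of and spawn_and_verify are made up for the example. It assumes Linux with /proc mounted, and it only calls os.pidfd_open() where the running Python and kernel provide it. It also relies on no other thread reaping the child first, which is exactly the limitation raised above about waitpid(-1) loops.

```python
import os

def parent_pid_of(pid):
    """Return the parent pid recorded in /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name is parenthesised and may itself contain spaces or
    # parentheses, so split after the last ')'. The fields that follow
    # are: state, ppid, ...
    fields = stat[stat.rindex(")") + 1:].split()
    return int(fields[1])

def spawn_and_verify(exit_code=7):
    """Hypothetical helper: fork a child, look it up by pid, verify it is ours."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit immediately

    # Parent: until we wait() for the child, its pid cannot be recycled
    # (assuming nothing else in this process reaps it behind our back).
    pidfd = None
    if hasattr(os, "pidfd_open"):  # Python 3.9+ on a kernel with pidfd_open
        try:
            pidfd = os.pidfd_open(pid)
        except OSError:
            pidfd = None  # syscall unavailable or blocked; skip the pidfd

    # Verify the result after opening: the process behind this pid must
    # have us as its parent.
    is_ours = parent_pid_of(pid) == os.getpid()

    _, status = os.waitpid(pid, 0)  # reap; only now may the pid be reused
    if pidfd is not None:
        os.close(pidfd)
    return is_ours, os.WEXITSTATUS(status)
```

A CLONE_PIDFD-style interface would make the lookup and verification steps unnecessary, since the descriptor would come back from process creation itself.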