On Mon, 24 Jul 2023 at 10:23, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > This means pidfd_getfd() needs the same treatment as MSG_PEEK for sockets. So the reason I think pidfd_getfd() is ok is that it has serialized with the source of the file descriptor and uses fget_task() -> __fget_files. And that code is nasty and complicated, but it does get_file_rcu() to increment the file count, and then *after* that it still checks that yes, the file pointer is still there. And that means that anybody who uses fget_task() will only ever get a ref to a file if that file still existed in the source, and you can never have a situation where a file comes back to life. The reason MSG_PEEK is special is exactly because it can "resurrect" a file that was closed, and added to the unix SCM garbage collection list as "only has a ref in the SCM thing", so when we then make it live again, it needs that very very subtle thing. So pidfd_getfd() is ok in this regard. But it *is* an example of how subtle it is to just get a new ref to an existing file. That whole if (atomic_read_acquire(&files->count) == 1) { in __fget_light() is also worth being aware of. It isn't about the file count, but it is about "I have exclusive access to the file table". So you can *not* close a file, or open a file, for another process from outside. The only thread that is allowed to access or change the file table (including resizing it), is the thread itself. I really hope we don't have cases of that. Linus