On 14/11/2024 11:29, Christian Brauner wrote: >> Moudlo namespaces, the pid in fid->pid is the same one passed to pidfd_open(). >> In the root namespace, you could replace name_to_handle_at(...) with >> pidfd_open(fid->pid, 0) and get the same result (if both are successful, at least). >> >> The resulting pidfd points to the same struct pid. The only thing that should differ >> is whether PIDFD_THREAD is set in f->f_flags. > I see what you mean but then there's another problem afaict. > > Two cases: > > (1) @pidfd_thread_group = pidfd_open(1234, 0) > > The pidfd_open() will succeed if the struct pid that 1234 resolves > to is used as a thread-group leader. > > (2) @pidfd_thread = pidfd_open(5678, PIDFD_THREAD) > > The pidfd_open() will succeed even if the struct pid that 5678 > resolves to isn't used as a thread-group leader. > > The resulting struct file will be marked as being a thread pidfd by > raising O_EXCL. > > (1') If (1) is passed to name_to_handle_at() a pidfs file handle is > encoded for 1234. If later open_by_hande_at() is called then by > default a thread-group leader pidfd is created. This is fine > > (2') If (2) is passed to name_to_handle_at() a pidfs file handle is > encoded for 5678. If later open_by_handle_at() is called then a > thread-group leader pidfd will be created again. > > So in (2') the caller has managed to create a thread-group leader pidfd > even though the struct pid isn't used as a thread-group leader pidfd. > Consequently, that pidfd is useless when passed to any of the pidfd_*() > system calls. > > So basically, you need to verify that if O_EXCL isn't specified with > open_by_handle_at() that the struct pid that is resolved is used as a > thread-group leader and if not, refuse to create a pidfd. > > Am I making sense? Ah, I fully see what you mean now. I could implement pidfs_file_operations.open and check the flags there, but that runs into the issue of vfs_open resetting the flags afterwards so its entirely pointless. If PIDFD_THREAD wasn't in the set O_CREAT / O_EXCL / O_NOCTTY / O_TRUNC then this would be much easier, but alas; and its ABI now too. I guess the options are 1. Let an FS specify that it doesn't want O_EXCL cleared, but this is getting to be some gnarly VFS surgery, or 2. We just detect we're working on a pidfd early in open_by_handle_at and skip straight into dedicated logic. I know you suggested (2) earlier and I increasingly think you're right about it being the best approach. It also fits better with the special casing PIDFD_SELF will want when that lands. So I'll see what an implementation with that approach looks like. >> If they want a PIDFD_THREAD pidfd, yes. I see it as similar to O_RDONLY, where its a >> flag that applies to the file descriptor but not to the underlying file. > This is probably fine. >> While ideally we'd implement it from an API completeness perspective, practically I'm >> not sure how often the option would ever be used. While there are hundreds of reasons >> why you might want to track the state of another process, I struggle to think of cases >> where Process A needs to track Process B's threads besides a debugger (and a debugger >> is probably better off using ptrace), and it can happily track its own threads by just >> holding onto the pidfd. > We recently imlemented PIDFD_THREAD support because it is used inside > Netflix. I forgot the details thought tbh. So it's actually used. We > only implemented it once people requested it. Oh, I entirely understand the utility of PIDFD_THREAD - I'm just not sure how mnay of those use cases are cross-process (and in the cases where they are cross process, how many of those uses would benefit from file handles vs fd passing)