On 28 Apr 2022, at 9:51, Hannes Reinecke wrote: > On 4/28/22 15:30, Jakub Kicinski wrote: >> On Thu, 28 Apr 2022 09:26:41 +0200 Hannes Reinecke wrote: >>> The whole thing started off with the problem on _how_ sockets could be >>> passed between kernel and userspace and vice versa. >>> While there is fd passing between processes via AF_UNIX, there is no >>> such mechanism between kernel and userspace. >> >> Noob question - the kernel <> user space FD sharing is just >> not implemented yet, or somehow fundamentally hard because kernel >> fds are "special"? > > Noob reply: wish I knew. (I somewhat hoped _you_ would've been able to > tell me.) > > Thing is, the only method I could think of for fd passing is the POSIX fd > passing via unix_attach_fds()/unix_detach_fds(). But that's AF_UNIX, > which really is designed for process-to-process communication, not > process-to-kernel. So you probably have to move a similar logic over to > AF_NETLINK. And design a new interface on how fds should be passed over > AF_NETLINK. > > But then you have to face the issue that AF_NELINK is essentially UDP, and > you have _no_ idea if and how many processes do listen on the other end. > Thing is, you (as the sender) have to copy the fd over to the receiving > process, so you'd better _hope_ there is a receiving process. Not to > mention that there might be several processes listening in... > > And that's something I _definitely_ don't feel comfortable with without > guidance from the networking folks, so I didn't pursue it further and we > went with the 'accept()' mechanism Chuck implemented. > > I'm open to suggestions, though. EXPORT_SYMBOL(receive_fd) would allow interesting implementations. The kernel keyring facilities have a good API for creating various key_types which are able to perform work such as this from userspace contexts. I have a working prototype for a keyring key instantiation which allows a userspace process to install a kernel fd on its file table. The problem here is how to match/route such fd passing to appropriate processes in appropriate namespaces. I think this problem is shared by all kernel-to-userspace upcalls, which I hope we can discuss at LSF/MM. I don't think kernel fds are very special as compared to userspace fds. Ben