On 11/06, Willem de Bruijn wrote: > > > > > I think my other issue with MSG_SOCK_DEVMEM being on recvmsg is that > > > > > it somehow implies that I have an option of passing or not passing it > > > > > for an individual system call. > > > > > If we know that we're going to use dmabuf with the socket, maybe we > > > > > should move this flag to the socket() syscall? > > > > > > > > > > fd = socket(AF_INET6, SOCK_STREAM, SOCK_DEVMEM); > > > > > > > > > > ? > > > > > > > > I think it should then be a setsockopt called before any data is > > > > exchanged, with no change of modifying mode later. We generally use > > > > setsockopts for the mode of a socket. This use of the protocol field > > > > in socket() for setting a mode would be novel. Also, it might miss > > > > passively opened connections, or be overly restrictive: one approach > > > > for all accepted child sockets. > > > > > > I was thinking this is similar to SOCK_CLOEXEC or SOCK_NONBLOCK? There > > > are plenty of bits we can grab. But setsockopt works as well! > > > > To follow up: if we have this flag on a socket, not on a per-message > > basis, can we also use recvmsg for the recycling part maybe? > > > > while (true) { > > memset(msg, 0, ...); > > > > /* receive the tokens */ > > ret = recvmsg(fd, &msg, 0); > > > > /* recycle the tokens from the above recvmsg() */ > > ret = recvmsg(fd, &msg, MSG_RECYCLE); > > } > > > > recvmsg + MSG_RECYCLE can parse the same format that regular recvmsg > > exports (SO_DEVMEM_OFFSET) and we can also add extra cmsg option > > to recycle a range. > > > > Will this be more straightforward than a setsockopt(SO_DEVMEM_DONTNEED)? > > Or is it more confusing? > > It would have to be sendmsg, as recvmsg is a copy_to_user operation. > > > I am not aware of any precedent in multiplexing the data stream and a > control operation stream in this manner. It would also require adding > a branch in the sendmsg hot path. Is it too much plumbing to copy_from_user msg_control deep in recvmsg stack where we need it? Mixing in sendmsg is indeed ugly :-( Regarding hot patch: aren't we already doing copy_to_user for the tokens in this hot path, so having one extra condition shouldn't hurt too much? > The memory is associated with the socket, freed when the socket is > closed as well as on SO_DEVMEM_DONTNEED. Fundamentally it is a socket > state operation, for which setsockopt is the socket interface. > > Is your request purely a dislike, or is there some technical concern > with BPF and setsockopt? It's mostly because I've been bitten too much by custom socket options that are not really on/off/update-value operations: 29ebbba7d461 - bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen 00e74ae08638 - bpf: Don't EFAULT for getsockopt with optval=NULL 9cacf81f8161 - bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVE d8fe449a9c51 - bpf: Don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE I do agree that this particular case of SO_DEVMEM_DONTNEED seems ok, but things tend to evolve and change.