On Tue, Apr 19, 2022 at 6:42 PM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> Hey Jann,
>
> On Tue, Apr 19, 2022 at 6:38 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > This is a bit of a weird API, because normally .poll is supposed to be
> > level-triggered rather than edge-triggered... and AFAIK things like
> > epoll also kinda assume that ->poll() doesn't modify state (but that
> > only _really_ matters in weird cases). But at the same time, it looks
> > like the existing proc_sys_poll() already goes against that? So I
> > don't know what the right thing to do there is...
>
> Doesn't the level vs edge distinction apply to POLLIN/POLLOUT events?

I don't see why it would be limited to that.

> In this case, the event generated is actually POLLERR. On one hand,
> this is sort of weird. On the other hand, it perhaps makes sense,
> since nothing changes respect to its readability/writeability. And it
> also happens to be how the sysctl poll() infrastructure was designed;
> I didn't need to change anything for this behavior, and it comes as a
> result of this rather trivial commit only. Looking at where else it's
> used, it appears to be the intended use case for changes to
> hostname/domainname. So while it's unusual, it also appears to be the
> usual way that sysctl poll() works. So perhaps we're quite lucky here
> in that sysctl poll() winds up being the correct interface for what we
> want?

AFAIK this also means that if you make an epoll watch for
/proc/sys/kernel/random/fork_event, and then call poll() *on the epoll
fd* for some reason, that will probably already consume the event; and
if you then try to actually receive the epoll event via epoll_wait(),
it'll already be gone (because epoll tries to re-poll the "ready"
files to figure out what state those files are at now).
Similarly if you try to create an epoll watch for an FD that already
has an event pending: Installing the watch will call the ->poll
handler once, resetting the file's state, and the following
epoll_wait() will call ->poll again and think the event is already
gone. See the call paths to vfs_poll() in fs/eventpoll.c.

Maybe we don't care about such exotic usage, and are willing to accept
the UAPI inconsistency and slight epoll breakage of plumbing
edge-triggered polling through APIs designed for level-triggered
polling. IDK.
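
To make the two epoll cases above concrete, here is a toy Python model
of the interaction. It is only a sketch, not kernel code: EdgeFile and
ToyEpoll are hypothetical names, and the model assumes nothing more
than what is described above, namely that the file's poll handler
clears the pending event when it reports it, and that epoll polls a
file when a watch is installed and re-polls "ready" files later.

```python
POLLERR = 0x008  # used only as a flag value in this model


class EdgeFile:
    """Hypothetical file whose poll handler consumes the pending event."""

    def __init__(self):
        self.generation = 0  # bumped each time the event fires
        self.seen = 0        # last generation this open file has reported

    def fire(self):
        self.generation += 1

    def poll(self):
        if self.seen != self.generation:
            self.seen = self.generation  # poll modifies state here
            return POLLERR
        return 0


class ToyEpoll:
    """Rough model of an epoll instance; not the real data structures."""

    def __init__(self):
        self.ready = []

    def add(self, f):
        # Installing a watch polls the file once to check for readiness.
        if f.poll():
            self.ready.append(f)

    def wakeup(self, f):
        # Stand-in for the wakeup that marks a file as possibly ready.
        if f not in self.ready:
            self.ready.append(f)

    def poll(self):
        # poll() on the epoll fd re-polls the files on the ready list.
        return any(f.poll() for f in self.ready)

    def wait(self):
        # epoll_wait() re-polls each ready file and reports only those
        # that still show an event.
        events = []
        for f in self.ready:
            ev = f.poll()
            if ev:
                events.append((f, ev))
        self.ready = [f for f, _ in events]
        return events


def normal_case():
    f, ep = EdgeFile(), ToyEpoll()
    ep.add(f)
    f.fire()
    ep.wakeup(f)
    return ep.wait()


def pending_before_add():
    f, ep = EdgeFile(), ToyEpoll()
    f.fire()
    ep.add(f)  # the poll done at install time consumes the event
    return ep.wait()


def poll_on_epoll_fd():
    f, ep = EdgeFile(), ToyEpoll()
    ep.add(f)
    f.fire()
    ep.wakeup(f)
    saw_ready = ep.poll()  # consumes the event
    return saw_ready, ep.wait()
```

In this model normal_case() delivers the event once, while
pending_before_add() and poll_on_epoll_fd() both end with an empty
result from wait(), because an earlier ->poll call already reset the
file's state. With a poll handler that does not modify state, all
three would report the event.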