+Lennart, since systemd is the only userspace I know of currently making
use of this.

On Mon, May 02, 2022 at 04:06:01PM +0200, Jason A. Donenfeld wrote:
> Events that poll() responds to are supposed to be consumed when the file
> is read(), not by the poll() itself. By putting it on the poll() itself,
> it makes it impossible to poll() on a epoll file descriptor, since the
> event gets consumed too early. Jann wrote a PoC, available in the link
> below.
>
> Reported-by: Jann Horn <jannh@xxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> Cc: linux-fsdevel@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/lkml/CAG48ez1F0P7Wnp=PGhiUej=u=8CSF6gpD9J=Oxxg0buFRqV1tA@xxxxxxxxxxxxxx/
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> ---
>  fs/proc/proc_sysctl.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index 7d9cfc730bd4..1aa145794207 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -622,6 +622,14 @@ static ssize_t proc_sys_call_handler(struct kiocb *iocb, struct iov_iter *iter,
>
>  static ssize_t proc_sys_read(struct kiocb *iocb, struct iov_iter *iter)
>  {
> +	struct inode *inode = file_inode(iocb->ki_filp);
> +	struct ctl_table_header *head = grab_header(inode);
> +	struct ctl_table *table = PROC_I(inode)->sysctl_entry;
> +
> +	if (!IS_ERR(head) && table->poll)
> +		iocb->ki_filp->private_data = proc_sys_poll_event(table->poll);
> +	sysctl_head_finish(head);
> +
>  	return proc_sys_call_handler(iocb, iter, 0);
>  }
>
> @@ -668,10 +676,8 @@ static __poll_t proc_sys_poll(struct file *filp, poll_table *wait)
>  	event = (unsigned long)filp->private_data;
>  	poll_wait(filp, &table->poll->wait, wait);
>
> -	if (event != atomic_read(&table->poll->event)) {
> -		filp->private_data = proc_sys_poll_event(table->poll);
> +	if (event != atomic_read(&table->poll->event))
>  		ret = EPOLLIN | EPOLLRDNORM | EPOLLERR | EPOLLPRI;
> -	}
>
>  out:
>  	sysctl_head_finish(head);
> --
> 2.35.1

Just wanted to double check with you that this change wouldn't break how
you're using it in systemd for /proc/sys/kernel/hostname:

https://github.com/systemd/systemd/blob/39cd62c30c2e6bb5ec13ebc1ecf0d37ed015b1b8/src/journal/journald-server.c#L1832
https://github.com/systemd/systemd/blob/39cd62c30c2e6bb5ec13ebc1ecf0d37ed015b1b8/src/resolve/resolved-manager.c#L465

I couldn't find anybody else actually polling on it. Interestingly, it
looks like sd_event_add_io uses epoll() inside, but you're not hitting
the bug that Jann pointed out (because I suppose you're not poll()ing on
an epoll fd).

Jason