On Mon, 16 Nov 2020 11:10:01 -0500 Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote: > From: Willem de Bruijn <willemb@xxxxxxxxxx> > > Add epoll_create1 flag EPOLL_NSTIMEO. When passed, this changes the > interpretation of argument timeout in epoll_wait from msec to nsec. > > Use cases such as datacenter networking operate on timescales well > below milliseconds. Shorter timeouts bounds their tail latency. > The underlying hrtimer is already programmed with nsec resolution. hm, maybe. It's not very nice to be using one syscall to alter the interpretation of another syscall's argument in this fashion. For example, one wonders how strace(1) is to properly interpret & display this argument? Did you consider adding epoll_wait2()/epoll_pwait2() syscalls which take a nsec timeout? Seems simpler. Either way, we'd be interested in seeing the proposed manpage updates alongside this change. > --- a/fs/eventpoll.c > +++ b/fs/eventpoll.c > @@ -225,6 +225,9 @@ struct eventpoll { > unsigned int napi_id; > #endif > > + /* Accept timeout in ns resolution (EPOLL_NSTIMEO) */ > + unsigned int nstimeout:1; > + Why a bitfield? This invites other developers to add new bitfields to the same word. And if that happens, we'll need to consider the locking rules for that word - I don't think the compiler provides any protection against concurrent modifications to the bitfields which share a machine word. If we add a rule /* * per eventpoll flags. Initialized at creation time, never changes * thereafter */ then that would cover it. Or just make the thing a `bool'?