On Thu, Nov 24, 2022 at 06:57:41PM +0100, Paolo Abeni wrote: > To reduce the contention this patch introduces explicit reference counting > for the eventpoll struct. Each registered event acquires a reference, > and references are released at ep_remove() time. ep_free() doesn't touch > anymore the event RB tree, it just unregisters the existing callbacks > and drops a reference to the ep struct. The struct itself is freed when > the reference count reaches 0. The reference count updates are protected > by the mtx mutex so no additional atomic operations are needed. So, the behavior before this patch is that closing an epoll file frees all resources associated with it. This behavior is documented in the man page epoll_create(2): "When all file descriptors referring to an epoll instance have been closed, the kernel destroys the instance and releases the associated resources for reuse." The behavior after this patch is that the resources aren't freed until the epoll file *and* all files that were added to it have been closed. Is that okay? I suppose in most cases it is, since the usual use case for epoll is to have a long-lived epoll instance and shorter lived file descriptors that are polled using that long-lived epoll instance. But probably some users do things the other way around. I.e., they have a long-lived file descriptor that is repeatedly polled using different epoll instances that have a shorter lifetime. In that case, the number of 'struct eventpoll' and 'struct epitem' in kernel memory will keep growing until 'max_user_watches' is hit, at which point EPOLL_CTL_ADD will start failing with ENOSPC. Are you sure that is fine? I'll also note that there is a comment at the top of fs/eventpoll.c that describes the locking scheme, which this patch forgets to update. - Eric