On Mon, Mar 18, 2024 at 01:20:04AM +0300, Ivan Trofimov wrote:
> Instead of unconditionally allocating and deallocating pwq objects,
> try to reuse them by storing the entry in the eventpoll struct
> at deallocation request, and consuming that entry at allocation request.
> This way every EPOLL_CTL_ADD operation immediately following an
> EPOLL_CTL_DEL operation effectively cancels out its pwq allocation
> with the preceding deallocation.
>
> With this patch applied I'm observing ~13% overall speedup when
> benchmarking the following scenario:
> 1. epoll_ctl(..., EPOLL_CTL_ADD, ...)
> 2. epoll_ctl(..., EPOLL_CTL_DEL, ...)
> which should be a pretty common one for either applications dealing with
> a lot of short-lived connections or applications doing a DEL + ADD dance
> per level-triggered FD readiness.
>
> This optimization comes with a sizeof(void*) + sizeof(struct eppoll_entry)
> per-epoll-instance memory cost, which amounts to 72 bytes for 64-bit.
>
> Signed-off-by: Ivan Trofimov <i.trofimow@xxxxxxxxx>
> ---
> This is my first ever attempt at submiting a patch for the kernel,
> so please don't hesitate to point out mistakes I'm doing in the process.
>
>  fs/eventpoll.c | 39 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 37 insertions(+), 2 deletions(-)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 882b89edc..c4094124c 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -219,6 +219,9 @@ struct eventpoll {
>  	u64 gen;
>  	struct hlist_head refs;
>
> +	/* a single-item cache used to reuse eppoll_entry allocations */
> +	struct eppoll_entry *pwq_slot;
> +
>  	/*
>  	 * usage count, used together with epitem->dying to
>  	 * orchestrate the disposal of this struct
> @@ -648,6 +651,36 @@ static void ep_remove_wait_queue(struct eppoll_entry *pwq)
>  	rcu_read_unlock();
>  }
>
> +/*
> + * This functions either consumes the pwq_slot, or allocates a new

"This function"

> @@ -789,6 +822,8 @@ static void ep_free(struct eventpoll *ep)
>  	mutex_destroy(&ep->mtx);
>  	free_uid(ep->user);
>  	wakeup_source_unregister(ep->ws);
> +	if (ep->pwq_slot)
> +		kmem_cache_free(pwq_cache, ep->pwq_slot);

A NULL check isn't necessary before kmem_cache_free().

> @@ -1384,7 +1419,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
>  	if (unlikely(!epi))	// an earlier allocation has failed
>  		return;
>
> -	pwq = kmem_cache_alloc(pwq_cache, GFP_KERNEL);
> +	pwq = ep_alloc_pwq(epi->ep);

What guarantees that ep->mtx is held here?

- Eric
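
For readers without the full patch at hand: the quoted hunks above trim the bodies of the new helpers, so the sketch below is only a guess at what the single-slot caching described in the changelog might look like; the ep_free_pwq() name and the exact details are assumptions, not code from the patch.

/* Sketch only: reconstructed from the changelog, not taken from the patch. */
static struct eppoll_entry *ep_alloc_pwq(struct eventpoll *ep)
{
	struct eppoll_entry *pwq;

	/* Consume the entry cached by a preceding EPOLL_CTL_DEL, if any. */
	if (ep->pwq_slot) {
		pwq = ep->pwq_slot;
		ep->pwq_slot = NULL;
		return pwq;
	}

	return kmem_cache_alloc(pwq_cache, GFP_KERNEL);
}

static void ep_free_pwq(struct eventpoll *ep, struct eppoll_entry *pwq)
{
	/* Stash the entry in the empty slot for later reuse, else free it. */
	if (!ep->pwq_slot)
		ep->pwq_slot = pwq;
	else
		kmem_cache_free(pwq_cache, pwq);
}

In a scheme like this, both the load and the store of ep->pwq_slot rely on ep->mtx for serialization, which is why the question above about what guarantees the mutex is held in ep_ptable_queue_proc() matters.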