On Sat, Apr 25, 2020 at 9:17 AM Jason Baron <jbaron@xxxxxxxxxx> wrote: > > > > On 4/24/20 3:00 PM, Khazhismel Kumykov wrote: > > In the event that we add to ovflist, before 339ddb53d373 we would be > > woken up by ep_scan_ready_list, and did no wakeup in ep_poll_callback. > > With that wakeup removed, if we add to ovflist here, we may never wake > > up. Rather than adding back the ep_scan_ready_list wakeup - which was > > resulting in unnecessary wakeups, trigger a wake-up in ep_poll_callback. > > I'm just curious which 'wakeup' we are talking about here? There is: > wake_up(&ep->wq) for the 'ep' and then there is the nested one via: > ep_poll_safewake(ep, epi). It seems to me that its only about the later > one being missing not both? Is your workload using nested epoll? > > If so, it might make sense to just do the later, since the point of > the original patch was to minimize unnecessary wakeups. The missing wake-ups were when we added to ovflist instead of rdllist. Both are "the ready list" together - so I'd think we'd want the same wakeups regardless of which specific list we added to. ep_poll_callback isn't nested specific? > > Thanks, > > -Jason > > > > > We noticed that one of our workloads was missing wakeups starting with > > 339ddb53d373 and upon manual inspection, this wakeup seemed missing to > > me. With this patch added, we no longer see missing wakeups. I haven't > > yet tried to make a small reproducer, but the existing kselftests in > > filesystem/epoll passed for me with this patch. > > > > Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") > > > > Signed-off-by: Khazhismel Kumykov <khazhy@xxxxxxxxxx> > > Reviewed-by: Roman Penyaev <rpenyaev@xxxxxxx> > > Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx> > > Cc: Heiher <r@xxxxxx> > > Cc: Jason Baron <jbaron@xxxxxxxxxx> > > Cc: <stable@xxxxxxxxxxxxxxx> > > --- > > v2: use if/elif instead of goto + cleanup suggested by Roman > > fs/eventpoll.c | 18 +++++++++--------- > > 1 file changed, 9 insertions(+), 9 deletions(-) > > > > diff --git a/fs/eventpoll.c b/fs/eventpoll.c > > index 8c596641a72b..d6ba0e52439b 100644 > > --- a/fs/eventpoll.c > > +++ b/fs/eventpoll.c > > @@ -1171,6 +1171,10 @@ static inline bool chain_epi_lockless(struct epitem *epi) > > { > > struct eventpoll *ep = epi->ep; > > > > + /* Fast preliminary check */ > > + if (epi->next != EP_UNACTIVE_PTR) > > + return false; > > + > > /* Check that the same epi has not been just chained from another CPU */ > > if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR) > > return false; > > @@ -1237,16 +1241,12 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v > > * chained in ep->ovflist and requeued later on. > > */ > > if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) { > > - if (epi->next == EP_UNACTIVE_PTR && > > - chain_epi_lockless(epi)) > > + if (chain_epi_lockless(epi)) > > + ep_pm_stay_awake_rcu(epi); > > + } else if (!ep_is_linked(epi)) { > > + /* In the usual case, add event to ready list. */ > > + if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist)) > > ep_pm_stay_awake_rcu(epi); > > - goto out_unlock; > > - } > > - > > - /* If this file is already in the ready list we exit soon */ > > - if (!ep_is_linked(epi) && > > - list_add_tail_lockless(&epi->rdllink, &ep->rdllist)) { > > - ep_pm_stay_awake_rcu(epi); > > } > > > > /* > >
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature