On Mon 14-11-22 09:20:57, Gabriel Krisman Bertazi wrote:
> Jan Kara <jack@xxxxxxx> writes:
> 
> > Gabriel, when looking through this patch, I've noticed we can lose
> > wakeups after your latest simplifications. See below for details:
> >
> > On Sat 05-11-22 19:10:55, Gabriel Krisman Bertazi wrote:
> >> @@ -587,7 +571,7 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
> >>  	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
> >>  		struct sbq_wait_state *ws = &sbq->ws[wake_index];
> >>  
> >> -		if (waitqueue_active(&ws->wait) && atomic_read(&ws->wait_cnt)) {
> >> +		if (waitqueue_active(&ws->wait)) {
> >>  			if (wake_index != atomic_read(&sbq->wake_index))
> >>  				atomic_set(&sbq->wake_index, wake_index);
> >>  			return ws;
> >
> > Neither sbq_wake_ptr() nor sbitmap_queue_wake_up() now increments the
> > wake_index after performing the wakeup. Thus we would effectively keep
> > waking tasks from a single waitqueue until it becomes empty, and only
> > then go for the next waitqueue. This creates unnecessary unfairness in
> > task wakeups and even possible starvation issues. So I think we need
> > to advance wake_index somewhere. Perhaps here, before returning the
> > waitqueue.
> 
> Right. This is indeed a problem. What do you think of the patch below?
> 
> > Now this may also be problematic - when we were checking the number of
> > woken waiters in the older version of the patch (for others: an
> > internal version of the patch) this was fine, but now it may happen
> > that the 'ws' we have selected has no waiters anymore. In that case we
> > need to find another waitqueue, because otherwise we'd be losing too
> > many wakeups and we could deadlock. So I think this rather needs to be
> > something like:
> >
> > 	do {
> > 		if (atomic_read(&sbq->completion_cnt) - wakeups < wake_batch)
> > 			return;
> > 	} while (!atomic_try_cmpxchg(&sbq->wakeup_cnt,
> > 				     &wakeups, wakeups + wake_batch));
> >
> > 	do {
> > 		ws = sbq_wake_ptr(sbq);
> > 		if (!ws)
> > 			return;
> > 	} while (!wake_up_nr(&ws->wait, wake_batch));
> >
> > with our original version of wake_up_nr() returning the number of
> > woken waiters. What do you think?
> 
> I agree, and it wouldn't happen with the wake_up_nr patch we had before.
> I will revive it quickly and follow up. But in this case I want to be
> cautious with benchmarking, so I will still follow up today, but only
> once the new round of tests completes.

Sure.

> -- >8 --
> Subject: [PATCH] sbitmap: Advance the queue index before waking up the queue
> 
> When a queue is woken up, the wake_index written by sbq_wake_ptr
> currently keeps pointing to the same queue. On the next wake up, it
> will thus retry the same queue, which is unfair to other queues and can
> lead to starvation. This patch moves the index update to happen before
> the queue is returned, such that it will now try a different queue
> first on the next wake up, improving fairness.
> 
> Reported-by: Jan Kara <jack@xxxxxxx>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@xxxxxxx>

Yes, nice. Feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

								Honza

> ---
>  lib/sbitmap.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index eca462cba398..bea7984f7987 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -571,13 +571,19 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
>  	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
>  		struct sbq_wait_state *ws = &sbq->ws[wake_index];
>  
> +		/*
> +		 * Advance the index before checking the current queue.
> +		 * It improves fairness by ensuring the queue doesn't
> +		 * need to be fully emptied before trying to wake up
> +		 * from the next one.
> +		 */
> +		wake_index = sbq_index_inc(wake_index);
> +
>  		if (waitqueue_active(&ws->wait)) {
>  			if (wake_index != atomic_read(&sbq->wake_index))
>  				atomic_set(&sbq->wake_index, wake_index);
>  			return ws;
>  		}
> -
> -		wake_index = sbq_index_inc(wake_index);
>  	}
>  
>  	return NULL;
> -- 
> 2.35.3
> 

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
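
[For readers following the thread, below is a minimal, self-contained
userspace sketch of the scheme discussed above, combining Jan's
batch-claiming loop with the advance-the-index-first selection from
Gabriel's patch. All names here (sbq_model, wake_up_nr_counted,
sbq_wake_ptr_model, sbq_wake_up_model) are hypothetical stand-ins for
the kernel's sbitmap_queue machinery; in particular, the kernel's
wake_up_nr() does not return a count, so the counting variant the
thread relies on is modeled explicitly.]

#include <stdatomic.h>

#define SBQ_WAIT_QUEUES 8

/* Stand-in for a kernel waitqueue: just tracks how many tasks wait on it. */
struct wait_queue {
	int nr_waiters;
};

struct sbq_model {
	atomic_int completion_cnt;	/* completions observed so far */
	atomic_int wakeup_cnt;		/* completions already spent on wakeups */
	atomic_int wake_index;		/* round-robin cursor over ws[] */
	struct wait_queue ws[SBQ_WAIT_QUEUES];
	int wake_batch;
};

/* Counting variant of wake_up_nr(): returns how many waiters were woken. */
static int wake_up_nr_counted(struct wait_queue *wq, int nr)
{
	int woken = nr < wq->nr_waiters ? nr : wq->nr_waiters;

	wq->nr_waiters -= woken;
	return woken;
}

/*
 * Pick the next waitqueue, advancing the cursor *before* the check so a
 * busy queue does not have to drain before its neighbours get a turn.
 */
static struct wait_queue *sbq_wake_ptr_model(struct sbq_model *sbq)
{
	int wake_index = atomic_load(&sbq->wake_index);

	for (int i = 0; i < SBQ_WAIT_QUEUES; i++) {
		struct wait_queue *wq = &sbq->ws[wake_index];

		wake_index = (wake_index + 1) % SBQ_WAIT_QUEUES;
		if (wq->nr_waiters) {
			atomic_store(&sbq->wake_index, wake_index);
			return wq;
		}
	}
	return NULL;
}

static void sbq_wake_up_model(struct sbq_model *sbq)
{
	int wakeups = atomic_load(&sbq->wakeup_cnt);
	struct wait_queue *wq;

	/* Claim one batch of completions; cmpxchg reloads wakeups on failure. */
	do {
		if (atomic_load(&sbq->completion_cnt) - wakeups < sbq->wake_batch)
			return;
	} while (!atomic_compare_exchange_weak(&sbq->wakeup_cnt, &wakeups,
					       wakeups + sbq->wake_batch));

	/* Retry until someone is actually woken, so no wakeups are lost. */
	do {
		wq = sbq_wake_ptr_model(sbq);
		if (!wq)
			return;
	} while (!wake_up_nr_counted(wq, sbq->wake_batch));
}

[The first loop spends each batch of completions exactly once via the
compare-and-exchange; the second loop retries queue selection until at
least one waiter is actually woken, which is what prevents the
lost-wakeup deadlock Jan points out when the selected queue has gone
empty in the meantime.]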