On Fri 21-07-23 17:57:15, Ming Lei wrote:
> From: David Jeffery <djeffery@xxxxxxxxxx>
>
> Current code supposes that it is enough to provide forward progress by just
> waking up one wait queue after one completion batch is done.
>
> Unfortunately this way isn't enough, cause waiter can be added to
> wait queue just after it is woken up.
>
> Follows one example(64 depth, wake_batch is 8)
>
> 1) all 64 tags are active
>
> 2) in each wait queue, there is only one single waiter
>
> 3) each time one completion batch(8 completions) wakes up just one waiter in each wait
> queue, then immediately one new sleeper is added to this wait queue
>
> 4) after 64 completions, 8 waiters are wakeup, and there are still 8 waiters in each
> wait queue
>
> 5) after another 8 active tags are completed, only one waiter can be wakeup, and the other 7
> can't be waken up anymore.
>
> Turns out it isn't easy to fix this problem, so simply wakeup enough waiters for
> single batch.
>
> Cc: David Jeffery <djeffery@xxxxxxxxxx>
> Cc: Kemeng Shi <shikemeng@xxxxxxxxxxxxxxx>
> Cc: Gabriel Krisman Bertazi <krisman@xxxxxxx>
> Cc: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>

I'm sorry for the delay - I was on vacation. I can see the patch already got
merged and I'm not strictly against that (although I think Gabriel was
experimenting with this exact wakeup scheme and, as far as I remember, the
more eager waking up was causing a performance decrease for some
configurations). But let me challenge the analysis above a bit.

For the sleeper to be added to a waitqueue in step 3),
blk_mq_mark_tag_wait() must fail the blk_mq_get_driver_tag() call, which
means that all tags were used at that moment. To summarize, anytime we add
any new waiter to the waitqueue, all tags are used and thus we should
eventually receive enough wakeups to wake all of them. What am I missing?
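A minimal userspace sketch of the two wake-up loops, as they appear in the
quoted diff, may help make the difference concrete. It is only a model: each
wait queue is reduced to a count of sleepers, and the names NR_WAIT_QUEUES,
model_wake_up_nr(), wake_old() and wake_new() are hypothetical stand-ins, not
kernel symbols. It assumes 8 wait queues, as in the commit message example,
and it does not model tag allocation, so it says nothing about whether the
state in step 3) can actually be reached.

```c
#include <assert.h>

#define NR_WAIT_QUEUES 8	/* assumption: 8 queues, as in the example */

/*
 * Hypothetical stand-in for wake_up_nr(): wakes at most nr sleepers on one
 * queue (modelled as a plain counter) and returns how many were woken.
 */
static int model_wake_up_nr(int *sleepers, int nr)
{
	int woken = *sleepers < nr ? *sleepers : nr;

	*sleepers -= woken;
	return woken;
}

/* Pre-patch loop: stop at the first queue where at least one waiter woke. */
static int wake_old(int queues[NR_WAIT_QUEUES], int *wake_index, int nr)
{
	int total = 0;

	assert(nr > 0);
	for (int i = 0; i < NR_WAIT_QUEUES; i++) {
		int *ws = &queues[*wake_index];

		*wake_index = (*wake_index + 1) % NR_WAIT_QUEUES;
		if (*ws > 0) {
			int woken = model_wake_up_nr(ws, nr);

			total += woken;
			if (woken)
				break;
		}
	}
	return total;
}

/* Post-patch loop: keep visiting queues until nr waiters have been woken. */
static int wake_new(int queues[NR_WAIT_QUEUES], int *wake_index, int nr)
{
	int total = 0;

	assert(nr > 0);
	for (int i = 0; i < NR_WAIT_QUEUES; i++) {
		int *ws = &queues[*wake_index];

		*wake_index = (*wake_index + 1) % NR_WAIT_QUEUES;
		if (*ws > 0) {
			int woken = model_wake_up_nr(ws, nr);

			total += woken;
			if (woken == nr)
				break;
			nr -= woken;	/* carry the remainder to the next queue */
		}
	}
	return total;
}
```

With one sleeper on each of the 8 queues and nr = 8, wake_old() wakes a
single waiter and stops, while wake_new() walks all queues and wakes 8. That
is the "more eager" behaviour referred to above.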

								Honza

> ---
>  lib/sbitmap.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index eff4e42c425a..d0a5081dfd12 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -550,7 +550,7 @@ EXPORT_SYMBOL_GPL(sbitmap_queue_min_shallow_depth);
>
>  static void __sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
>  {
> -	int i, wake_index;
> +	int i, wake_index, woken;
>
>  	if (!atomic_read(&sbq->ws_active))
>  		return;
> @@ -567,13 +567,12 @@ static void __sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
>  		 */
>  		wake_index = sbq_index_inc(wake_index);
>
> -		/*
> -		 * It is sufficient to wake up at least one waiter to
> -		 * guarantee forward progress.
> -		 */
> -		if (waitqueue_active(&ws->wait) &&
> -		    wake_up_nr(&ws->wait, nr))
> -			break;
> +		if (waitqueue_active(&ws->wait)) {
> +			woken = wake_up_nr(&ws->wait, nr);
> +			if (woken == nr)
> +				break;
> +			nr -= woken;
> +		}
>  	}
>
>  	if (wake_index != atomic_read(&sbq->wake_index))
> --
> 2.40.1
>
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR