Re: [PATCH] blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues

Laurence Oberman <loberman@xxxxxxxxxx> · Tue, 22 Feb 2022 09:31:50 -0500



On Mon, 2022-02-14 at 09:50 -0500, John Pittman wrote:
> This patch has now been tested in the customer environment and results
> were good (fixed the hangs).
> 
> On Mon, Feb 7, 2022 at 9:45 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> > 
> > > On Tue, Feb 1, 2022 at 4:34 AM David Jeffery <djeffery@xxxxxxxxxx> wrote:
> > > 
> > > > When blk_mq_delay_run_hw_queues sets an hctx to run in the future,
> > > > it can reset the delay length for an already pending delayed work
> > > > run_work. This creates a scenario where multiple hctx may have
> > > > their queues set to run, but if one runs first and finds nothing
> > > > to do, it can reset the delay of another hctx and stall the other
> > > > hctx's ability to run requests.
> > > > 
> > > > To avoid this I/O stall when an hctx's run_work is already pending,
> > > > leave it untouched to run at its current designated time rather
> > > > than extending its delay. The work will still run which keeps
> > > > closed the race calling blk_mq_delay_run_hw_queues is needed for
> > > > while also avoiding the I/O stall.
> > > 

> > > Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx>
> > > ---
> > >  block/blk-mq.c |    8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > 
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index f3bf3358a3bb..ae46eb4bf547 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -2177,6 +2177,14 @@ void blk_mq_delay_run_hw_queues(struct
> > > request_queue *q, unsigned long msecs)
> > >         queue_for_each_hw_ctx(q, hctx, i) {
> > >                 if (blk_mq_hctx_stopped(hctx))
> > >                         continue;
> > > +               /*
> > > +                * If there is already a run_work pending, leave
> > > the
> > > +                * pending delay untouched. Otherwise, a hctx can
> > > stall
> > > +                * if another hctx is re-delaying the other's
> > > work
> > > +                * before the work executes.
> > > +                */
> > > +               if (delayed_work_pending(&hctx->run_work))
> > > +                       continue;
> > 
> > > The issue is triggered on BFQ, since BFQ's has_work() may return
> > > true while its ->dispatch_request() returns NULL, so
> > > blk_mq_delay_run_hw_queues() is called to schedule a delayed run.
> > > 
> > > With multiple hw queues, the described issue may be triggered and
> > > cause a long I/O stall. There are only 3 in-tree callers of
> > > blk_mq_delay_run_hw_queues(), and David's fix works well for all
> > > 3, so this patch looks fine:
> > 
> > Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > 
> > Thanks,
> > 
> 
> 

Hello

Jens, gentle ping: can we get this in, please?

Sincerely,

Laurence and the RH team
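
For anyone skimming the thread, the stall described in the commit message can be sketched as a small userspace model. This is only an illustration, not kernel code: struct sim_hctx and the sim_* functions are hypothetical names, time is a plain millisecond counter, and the model assumes the unpatched path re-arms a pending delayed work with a fresh expiry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one hctx's delayed run_work; not a kernel type. */
struct sim_hctx {
	bool pending;          /* is a delayed run already scheduled? */
	unsigned long expires; /* simulated time (ms) at which it will run */
};

/*
 * Arm or re-arm the delayed run, always resetting the expiry.
 * This is the re-arm behaviour assumed for the unpatched path.
 */
void sim_delay_run_hw_queue(struct sim_hctx *h, unsigned long now,
			    unsigned long msecs)
{
	h->pending = true;
	h->expires = now + msecs;
}

/* Model of the per-queue loop; skip_pending = true models the patched check. */
void sim_delay_run_hw_queues(struct sim_hctx *hctxs, size_t n,
			     unsigned long now, unsigned long msecs,
			     bool skip_pending)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_pending && hctxs[i].pending)
			continue; /* leave the pending delay untouched */
		sim_delay_run_hw_queue(&hctxs[i], now, msecs);
	}
}

/*
 * Scenario: hctx 1 is armed at t=0 with a 3 ms delay. At t=2, hctx 0 runs,
 * finds nothing to dispatch, and requests a 3 ms delayed run of all queues.
 * Returns the time at which hctx 1 is finally due to run.
 */
unsigned long sim_scenario(bool skip_pending)
{
	struct sim_hctx hctxs[2] = { { false, 0 }, { false, 0 } };

	sim_delay_run_hw_queue(&hctxs[1], 0, 3);
	sim_delay_run_hw_queues(hctxs, 2, 2, 3, skip_pending);
	return hctxs[1].expires;
}
```

In this model, sim_scenario(false) pushes hctx 1's run from t=3 out to t=5, and repeated calls would keep pushing it further; sim_scenario(true) leaves it at t=3, which is the behaviour the patch aims for.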