Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue

Christoph Hellwig <hch@xxxxxx> · Mon, 4 Feb 2019 15:30:05 +0100

On Mon, Feb 04, 2019 at 10:02:18AM +0200, Adrian Hunter wrote:
> On 4/02/19 8:24 AM, Ulf Hansson wrote:
> > + Jens, Christoph, Adrian, Linus
> > 
> > On Thu, 31 Jan 2019 at 21:16, Zachary Hays <zhays@xxxxxxxxxxx> wrote:
> >>
> >> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> >> This generates a rescuer thread for that queue that will trigger when
> >> the CPU is under heavy load and collect the uncompleted work.
> >>
> >> In the case of mmc, this creates the possibility of a deadlock, as
> >> other blk-mq work also runs on the same queue. For example:
> >>
> >> - worker 0 claims the mmc host
> >> - worker 1 attempts to claim the host
> >> - worker 0 schedules complete_work to release the host
> >> - rescuer thread is triggered after time-out and collects the dangling
> >>   work
> >> - rescuer thread attempts to complete the work in order starting with
> >>   claim host
> >> - the task to release host is now blocked by a task to claim it and
> >>   will never be called
> >>
> >> The above results in multiple hung tasks that lead to failures to boot.
> >>
> >> Switching complete_work to the system_highpri queue avoids this
> >> because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> >> the host to be released without getting blocked by other claim tasks.
> >>
> > 
> > Thanks for the fix and the detailed description of the problem!
> > 
> >> Signed-off-by: Zachary Hays <zhays@xxxxxxxxxxx>
> >> ---
> >>  drivers/mmc/core/block.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> >> index aef1185f383d..59b6b41b84c6 100644
> >> --- a/drivers/mmc/core/block.c
> >> +++ b/drivers/mmc/core/block.c
> >> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
> >>                 if (waiting)
> >>                         wake_up(&mq->wait);
> >>                 else
> >> -                       kblockd_schedule_work(&mq->complete_work);
> >> +                       queue_work(system_highpri_wq, &mq->complete_work);
> > 
> > Even if this solves the problem, I think we need some input from some
> > of the block experts/maintainers to understand if this is the correct
> > way to fix the problem. So, I have looped them in.
> > 
> > I am guessing MMC is not the only block device driver that has this
> > kind of locking issue. Or perhaps it is..
> 
> WRT kblockd_workqueue, there is also still this issue outstanding:
> 
> 	https://lore.kernel.org/lkml/20170921140729.GA17333@xxxxxx/

Did you post your summary to the block list?  Or even look into a
prototype that uses a kthread for the deferred submission in blk-mq
instead of a workqueue?
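
As a rough illustration of the ordering problem described in the quoted
commit message, here is a minimal userspace sketch. It is not code from the
patch or from the kernel. It assumes a toy model in which one execution
context (standing in for the rescuer thread) runs queued items strictly in
order and cannot get past an item that would block. All names (sim_host,
work_kind, run_work, drain_in_order) are hypothetical and exist only for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item kinds, mirroring "claim host" and "release host". */
enum work_kind { WORK_CLAIM_HOST, WORK_RELEASE_HOST };

/* Toy stand-in for the mmc host claim state. */
struct sim_host {
	bool claimed;
};

/*
 * Run one item. Returns false if the item would have to sleep, i.e. a
 * claim is attempted while the host is still held by someone else.
 */
static bool run_work(struct sim_host *host, enum work_kind kind)
{
	if (kind == WORK_CLAIM_HOST) {
		if (host->claimed)
			return false;	/* would wait for a release */
		host->claimed = true;
		return true;
	}
	host->claimed = false;		/* WORK_RELEASE_HOST */
	return true;
}

/*
 * Process items strictly in order from a single context and stop at the
 * first one that would block. Returns the number of items completed.
 */
size_t drain_in_order(struct sim_host *host, const enum work_kind *items,
		      size_t n)
{
	size_t done = 0;

	assert(host != NULL && (items != NULL || n == 0));
	while (done < n && run_work(host, items[done]))
		done++;
	return done;
}
```

In this model, if the host is already claimed and the single context is
handed { WORK_CLAIM_HOST, WORK_RELEASE_HOST }, it completes nothing, because
the release is stuck behind the claim. Running the release from a separate
context first, which is what moving complete_work to another workqueue aims
for, lets the claim go through afterwards. The real workqueue and rescuer
behaviour is more involved than this.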