Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue

Ulf Hansson <ulf.hansson@xxxxxxxxxx> · Mon, 4 Feb 2019 13:52:49 +0100

On Mon, 4 Feb 2019 at 07:24, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> + Jens, Christoph, Adrian, Linus
>
> On Thu, 31 Jan 2019 at 21:16, Zachary Hays <zhays@xxxxxxxxxxx> wrote:
> >
> > The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> > This generates a rescuer thread for that queue that will trigger when
> > the CPU is under heavy load and collect the uncompleted work.
> >
> > In the case of mmc, this creates the possibility of a deadlock as
> > other blk-mq work is also run on the same queue. For example:
> >
> > - worker 0 claims the mmc host
> > - worker 1 attempts to claim the host
> > - worker 0 schedules complete_work to release the host
> > - rescuer thread is triggered after time-out and collects the dangling
> >   work
> > - rescuer thread attempts to complete the work in order starting with
> >   claim host
> > - the task to release host is now blocked by a task to claim it and
> >   will never be called

A second thought about this.

Claiming and releasing the host is handled a bit specially when the
claim is made to serve a block I/O request. The mmc host is actually
re-claimable in these cases, which is needed to allow us to operate on
two I/O requests simultaneously for the same mmc host.

mmc_claim_host() shouldn't even have to wait to get access to the
mmc host in these cases. So, it's a bit weird that you observe this
deadlock/hang.
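
To make that expectation concrete, here is a minimal user-space sketch
of what "re-claimable" is supposed to mean. It is only a model under
stated assumptions, not the kernel code: struct model_ctx, struct
model_host, model_try_claim() and model_release() are hypothetical
names, and the wait queue is reduced to a "would have to sleep" return
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a claim context (illustration only). */
struct model_ctx {
	int id;
};

/* Hypothetical stand-in for the claim state kept in the host. */
struct model_host {
	bool claimed;                    /* is the host currently claimed? */
	const struct model_ctx *claimer; /* context holding the claim */
	int claim_cnt;                   /* nesting depth of the claim */
};

/*
 * Try to claim the host for @ctx.
 *
 * Returns true if the claim succeeds right away: the host is free, or
 * it is already claimed by the same context (the re-claim case).
 * Returns false if the caller would have to sleep until the current
 * claimer releases the host. The sleeping itself is not modelled.
 */
static bool model_try_claim(struct model_host *host,
			    const struct model_ctx *ctx)
{
	if (host->claimed && host->claimer != ctx)
		return false;

	host->claimed = true;
	host->claimer = ctx;
	host->claim_cnt++;
	return true;
}

/*
 * Drop one level of the claim.
 *
 * Returns true once the host is fully released, which is the point
 * where waiters would be woken up. Returns false while nested claims
 * are still outstanding.
 */
static bool model_release(struct model_host *host)
{
	/* Releasing a host that is not claimed is a caller bug. */
	assert(host->claimed && host->claim_cnt > 0);

	if (--host->claim_cnt > 0)
		return false;

	host->claimed = false;
	host->claimer = NULL;
	return true;
}
```

In this model, two claims made with the same context both succeed
without waiting, and only a claim from a different context would have
to sleep until the count drops back to zero. If all block I/O workers
claim with the same context, none of them should end up blocked behind
another, which is why the reported hang is surprising.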

Perhaps there is a problem internally with __mmc_claim_host() and
mmc_release_host() that we overlooked when we introduced the
re-claimable host for the block I/O path. There is a wait queue in
there; perhaps it isn't working as we expect in this scenario...

> >
> > The above results in multiple hung tasks that lead to failures to boot.
> >
> > Switching complete_work to the system_highpri queue avoids this
> > because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> > the host to be released without getting blocked by other claim tasks.
> >
>

[...]

Kind regards
Uffe