On Mon, 4 Feb 2019 at 07:24, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> + Jens, Christoph, Adrian, Linus
>
> On Thu, 31 Jan 2019 at 21:16, Zachary Hays <zhays@xxxxxxxxxxx> wrote:
> >
> > The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> > This generates a rescuer thread for that queue that will trigger when
> > the CPU is under heavy load and collect the uncompleted work.
> >
> > In the case of mmc, this creates the possibility of a deadlock as
> > other blk-mq is also run on the same queue. For example:
> >
> > - worker 0 claims the mmc host
> > - worker 1 attempts to claim the host
> > - worker 0 schedules complete_work to release the host
> > - rescuer thread is triggered after time-out and collects the dangling
> >   work
> > - rescuer thread attempts to complete the work in order starting with
> >   claim host
> > - the task to release host is now blocked by a task to claim it and
> >   will never be called

A second thought about this.

Claiming and releasing the host is handled a bit specially when the
claim is made to serve a block I/O request. The mmc host is actually
re-claimable in these cases, which is needed to allow us to operate on
two I/O requests simultaneously for the same mmc host.

mmc_claim_host() shouldn't even have to wait to get access to the mmc
host in these cases. So, it's a bit weird that you observe this
deadlock/hang.

Perhaps there is a problem internally in __mmc_claim_host() and
mmc_release_host() that we overlooked when we introduced the
re-claimable host for the block I/O path. There is a wait queue in
there; perhaps it isn't working as we expect in this scenario...

> >
> > The above results in multiple hung tasks that lead to failures to boot.
> >
> > Switching complete_work to the system_highpri queue avoids this
> > because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> > the host to be released without getting blocked by other claims tasks.
> >
> [...]

Kind regards
Uffe
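
To illustrate the re-claimable host idea mentioned above, here is a rough
user-space model. It is only a sketch and not the kernel code: struct
host_model, struct claim_ctx, model_try_claim() and model_release() are
hypothetical names made up for this example, and where the real
__mmc_claim_host() would sleep on the host's wait queue, the model simply
returns false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; this is NOT the kernel's struct mmc_host. */
struct claim_ctx {
	int id;
};

struct host_model {
	bool claimed;                    /* host is currently owned */
	const struct claim_ctx *claimer; /* context that owns the host */
	int claim_cnt;                   /* nesting depth of claims */
};

/*
 * Try to claim the host for ctx. Returns true if the claim succeeds at
 * once (host free, or already owned by the same context). Returns false
 * where a real implementation would have to sleep on a wait queue.
 */
static bool model_try_claim(struct host_model *h, const struct claim_ctx *ctx)
{
	if (h->claimed && h->claimer != ctx)
		return false;

	h->claimed = true;
	h->claimer = ctx;
	h->claim_cnt++;
	return true;
}

/*
 * Drop one claim. Returns true when the host became free, which is the
 * point where sleeping waiters would need to be woken up.
 */
static bool model_release(struct host_model *h)
{
	if (!h->claimed || h->claim_cnt == 0)
		return false;

	if (--h->claim_cnt > 0)
		return false;	/* still held by a nested claim */

	h->claimed = false;
	h->claimer = NULL;
	return true;
}
```

In this model, two claims from the same context both succeed without
waiting, while a claim from a different context can only succeed after
every nested claim has been released. If a release never runs, the other
claimer waits forever, which is the kind of hang described in the report.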