On 2014-10-30 11:45, Christoph Hellwig wrote:
On Thu, Oct 30, 2014 at 07:32:52PM +0200, Meelis Roos wrote:
Can you try the patch below? It's a hack and not a proper fix, but it
addresses what seems to be your culprit, given that it is the only
place allocating a request from the error handler.
Applied it on top of 3.18-rc2, booted with scsi_mod.use_blk_mq=1 and it
booted up fine.
Jens,
any idea what we could do here? As far as I can read the code, we want
to lock the door again ASAP after potentially resetting the device
state (the commit message for it is utterly meaningless).
Right now the code allocates the request from the SCSI EH thread, which
is already dangerous but mostly works in the !blk-mq case. With the
strict "only allocate a request if a tag is available" policy, however,
this breaks down if BLOCK_PC requests that still have references on
them are blocking another request from being queued (ATA cdroms tend to
have a queue depth of 1).
Given that this was always best effort anyway, we might want to move it
to a separate workqueue so that it does not block EH?
So what we usually do for tagged devices that need some command for
error handling etc. is to have one tag reserved. The lock/unlock should
probably use a reserved request, given that it is invoked as part of
error handling. Right now we don't reserve a tag for untagged things
like PATA cdrom, but we could, since they don't care about the tag
anyway. And if we had that, plus a reserved grab in
scsi_eh_lock_door(), it should just work.
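
To make the reserved-tag idea concrete, here is a rough user-space
sketch. It is not kernel code and does not use the blk-mq API; the
names tag_pool, tag_pool_init, tag_get and tag_put are made up for
illustration. It only models the policy: normal allocations can never
take the reserved tag, so an error-handling command can still get one
when the regular queue depth is exhausted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a tag pool; names are illustrative only. */
#define TAG_POOL_MAX 8
#define NO_TAG (-1)

struct tag_pool {
	int nr_tags;			/* total tags, reserved ones included */
	int nr_reserved;		/* tags [0, nr_reserved) are held back */
	bool in_use[TAG_POOL_MAX];
};

/* depth = normal queue depth, reserved = extra tags kept for EH use. */
static void tag_pool_init(struct tag_pool *p, int depth, int reserved)
{
	assert(depth >= 0 && reserved >= 0);
	assert(depth + reserved <= TAG_POOL_MAX);

	p->nr_tags = depth + reserved;
	p->nr_reserved = reserved;
	for (int i = 0; i < TAG_POOL_MAX; i++)
		p->in_use[i] = false;
}

/*
 * Non-blocking allocation: returns a tag, or NO_TAG if none is free.
 * A normal caller only searches the non-reserved range; a reserved
 * caller only searches the reserved range.
 */
static int tag_get(struct tag_pool *p, bool reserved)
{
	int lo = reserved ? 0 : p->nr_reserved;
	int hi = reserved ? p->nr_reserved : p->nr_tags;

	for (int i = lo; i < hi; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = true;
			return i;
		}
	}
	return NO_TAG;
}

/* Release a tag previously returned by tag_get(). */
static void tag_put(struct tag_pool *p, int tag)
{
	assert(tag >= 0 && tag < p->nr_tags && p->in_use[tag]);
	p->in_use[tag] = false;
}
```

With tag_pool_init(&pool, 1, 1), i.e. a queue depth of 1 plus one
reserved tag, a stuck BLOCK_PC request holding the only normal tag makes
tag_get(&pool, false) return NO_TAG, while tag_get(&pool, true) still
succeeds for the door-lock command.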
--
Jens Axboe