On Wed, 6 Jul 2011, Roland Dreier wrote:

> On Wed, Jul 6, 2011 at 9:53 AM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > He probably meant blk_execute_rq_nowait().  The test has to be done
> > before the elevator is accessed.
>
> Hmm, seems we would need the test in multiple places, since my second call
> trace is io_schedule -> blk_flush_plug_list -> queue_unplugged ->
> __blk_run_queue
>
> So I don't think I hit blk_execute_rq_nowait in my crash.
>
> But maybe the problem is that dm-multipath is trying to requeue the IO to an
> underlying sdX device that is already dead?

I'm not at all familiar with the block layer.  It seems that the check
for a dead queue would have to be made on every path that ends up
calling the elevator, which would be a difficult sort of thing to
enforce.

I'm not too sure about James's comment:

> Moving the
> queue free is wrong ... it recently moved to fix another oops.

Apparently this refers to commit
e73e079bf128d68284efedeba1fbbc18d78610f9 ([SCSI] Fix oops caused by
queue refcounting failure).  In fact that commit does _not_ move the
call to scsi_free_queue().  Instead it merely takes another reference
to the queue, so that scsi_free_queue() doesn't actually deallocate the
queue.  But it does still deallocate the elevator.

Perhaps this means the elevator shouldn't be freed until the queue is.
I just don't know.  Jens and James are the experts, but Jens hasn't
said anything and James is currently busy.

Alan Stern
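
To make the lifetime problem discussed above concrete, here is a rough
userspace model in C.  It is only a sketch under my own assumptions, not
kernel code: every toy_* name is hypothetical and stands in loosely for
the request queue, its elevator, and the teardown done via
scsi_free_queue().  It shows an extra reference keeping the queue
allocated while the elevator is already gone, and the kind of dead-queue
test that each path reaching the elevator would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model only; none of these types are the kernel's. */
struct toy_elevator {
    int dispatched;                /* requests handed to the "elevator" */
};

struct toy_queue {
    int refcount;                  /* extra references keep the queue allocated */
    bool dead;                     /* set once teardown has started */
    struct toy_elevator *elevator; /* released at teardown, maybe before the queue */
};

static struct toy_queue *toy_queue_alloc(void)
{
    struct toy_queue *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;
    q->elevator = calloc(1, sizeof(*q->elevator));
    if (!q->elevator) {
        free(q);
        return NULL;
    }
    q->refcount = 1;
    return q;
}

/* Take an extra reference, as the commit discussed above is said to do. */
static void toy_queue_get(struct toy_queue *q)
{
    q->refcount++;
}

/* Drop a reference; returns true if this call freed the queue. */
static bool toy_queue_put(struct toy_queue *q)
{
    assert(q->refcount > 0);
    if (--q->refcount > 0)
        return false;
    free(q);
    return true;
}

/*
 * Model of the teardown: the elevator is released right away, while the
 * queue itself survives as long as someone else still holds a reference.
 * Returns true if the queue was freed as well.
 */
static bool toy_queue_teardown(struct toy_queue *q)
{
    q->dead = true;
    free(q->elevator);
    q->elevator = NULL;
    return toy_queue_put(q);
}

/*
 * Any path that ends up in the elevator has to make this test first;
 * without it, the line below would dereference a freed elevator.
 */
static int toy_run_queue(struct toy_queue *q)
{
    if (q->dead || !q->elevator)
        return -1;
    q->elevator->dispatched++;
    return 0;
}
```

In this model, a caller that did toy_queue_get() before the teardown
still holds a valid queue pointer afterwards, but toy_run_queue() on it
succeeds only because of the explicit test.  Freeing the elevator in
toy_queue_put() instead, when the last reference goes away, would make
the test unnecessary for memory safety, which is the alternative
suggested above.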