On Wed, Oct 09, 2024 at 01:38:20PM +0200, Christoph Hellwig wrote:
> When del_gendisk shuts down access to a gendisk, it could lead to a
> deadlock with sd or, which try to submit passthrough SCSI commands from
> their ->release method under open_mutex. The submission can be blocked
> in blk_enter_queue while del_gendisk can't get to actually telling them
> top stop and wake them up.
>
> As the disk is going away there is no real point in sending these
> commands, but we have no really good way to distinguish between the
> cases. For now mark even standalone (aka SCSI queues) as dying in
> del_gendisk to avoid this deadlock, but the real fix will be to split
> freeing a disk from freezing a queue for not disk associated requests.
>
> Reported-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> Reviewed-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>
> ---
>  block/genhd.c          | 16 ++++++++++++++--
>  include/linux/blkdev.h |  1 +
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 1c05dd4c6980b5..7026569fa8a0be 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -589,8 +589,16 @@ static void __blk_mark_disk_dead(struct gendisk *disk)
>  	if (test_and_set_bit(GD_DEAD, &disk->state))
>  		return;
>
> -	if (test_bit(GD_OWNS_QUEUE, &disk->state))
> -		blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);
> +	/*
> +	 * Also mark the disk dead if it is not owned by the gendisk. This
> +	 * means we can't allow /dev/sg passthrough or SCSI internal commands
> +	 * while unbinding a ULP. That is more than just a bit ugly, but until
> +	 * we untangle q_usage_counter into one owned by the disk and one owned
> +	 * by the queue this is as good as it gets. The flag will be cleared
> +	 * at the end of del_gendisk if it wasn't set before.
> +	 */
> +	if (!test_and_set_bit(QUEUE_FLAG_DYING, &disk->queue->queue_flags))
> +		set_bit(QUEUE_FLAG_RESURRECT, &disk->queue->queue_flags);

Setting QUEUE_FLAG_DYING may fail passthrough requests for !GD_OWNS_QUEUE
queues, and I guess this may cause a SCSI regression.

blk_queue_enter() needs to wait until RESURRECT & DYING are cleared instead
of returning failure.

Thanks,
Ming
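
To illustrate the suggestion, here is a rough userspace model of the
decision blk_queue_enter() would have to make. It is only a sketch, not
kernel code: the two flags mirror QUEUE_FLAG_DYING and QUEUE_FLAG_RESURRECT
from the quoted hunk, while the struct, the enum, and every model_* function
name are hypothetical. It also assumes RESURRECT is cleared together with
DYING at the end of del_gendisk, which the quoted hunk does not show.

```c
#include <assert.h>

/* Hypothetical stand-ins for QUEUE_FLAG_DYING / QUEUE_FLAG_RESURRECT. */
enum model_flag {
	MODEL_FLAG_DYING     = 1u << 0,
	MODEL_FLAG_RESURRECT = 1u << 1,
};

/* What a caller entering the queue should do (hypothetical). */
enum enter_result { ENTER_OK, ENTER_WAIT, ENTER_FAIL };

struct model_queue {
	unsigned int flags;
};

/* Mirrors the quoted hunk: set DYING, and set RESURRECT only if DYING
 * was not already set. */
static void model_mark_disk_dead(struct model_queue *q)
{
	if (!(q->flags & MODEL_FLAG_DYING))
		q->flags |= MODEL_FLAG_DYING | MODEL_FLAG_RESURRECT;
}

/* Assumed end-of-del_gendisk step: undo DYING only if we set it. */
static void model_finish_del_gendisk(struct model_queue *q)
{
	if (q->flags & MODEL_FLAG_RESURRECT)
		q->flags &= ~(MODEL_FLAG_DYING | MODEL_FLAG_RESURRECT);
}

/* The suggested behaviour: a temporarily dying queue makes the caller
 * wait and retry; only a really dying queue returns failure. */
static enum enter_result model_queue_enter(const struct model_queue *q)
{
	if (!(q->flags & MODEL_FLAG_DYING))
		return ENTER_OK;
	if (q->flags & MODEL_FLAG_RESURRECT)
		return ENTER_WAIT;
	return ENTER_FAIL;
}

/* Walk a standalone (SCSI-style) queue through a ULP unbind. */
static void model_demo(void)
{
	struct model_queue q = { 0 };

	assert(model_queue_enter(&q) == ENTER_OK);
	model_mark_disk_dead(&q);
	assert(model_queue_enter(&q) == ENTER_WAIT);	/* not a failure */
	model_finish_del_gendisk(&q);
	assert(model_queue_enter(&q) == ENTER_OK);
}
```

In the real code the ENTER_WAIT case would presumably be a sleep on the
existing wait queue until both bits are cleared, rather than a return value.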