On Fri, Oct 04, 2024 at 11:32:34PM +0900, Sergey Senozhatsky wrote:
> Hmm, setting QUEUE_FLAG_DYING unconditionally in __blk_mark_disk_dead()
> implies moving it up, to the very top of del_gendisk(), before the first
> time we grab ->open_mutex, because that's the issue that we are having.
> Does this sound like re-introducing the previous deadlock scenario (the
> one you pointed at previously) because of that "don't acquire ->open_mutex
> after freezing the queue" thing?

So the trace of that one is literally the same as the one you reported,
and I'm still wondering how they are related (I hope Yang Yang can
chime in).

I suspect that if we mark both the disk and queue dead early that will
error out everything and should fix it.  That would also avoid the issue
with your patch in the next reply that would skip marking the disk dead
when calling blk_mark_disk_dead.

(BTW, we really need to write a big fat comment explaining how we ended
up with whatever is the final fix here for the next person touching
the code)