Hi,

On Mon, Jul 26, 2021 at 02:14:30PM +0000, Wen Xiong wrote:
> >>V6 is basically same with V4, can you figure out where the failure
> >>comes?(v5.14-rc2, V6 or Daniel's V3)
>
> Looks 3/3 was not patched cleanly in v5.14-rc2 last week. I made the changes
> in block/blk-mq.c.rej manually but still missed the last part of 3/3 patch.

Sorry for the long delay on my side. It took a while to get my test
setup running again. The qla2xxx driver really doesn't like 'fast'
remote port toggling. But that's a different story.

Anyway, it turns out that my patch series is still not working
correctly. When I tested the series I deliberately forced execution of
the 'revising io queue count' path in nvme_fc_recreate_io_queues() by
doing:

--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2954,7 +2954,7 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 	if (ctrl->ctrl.queue_count == 1)
 		return 0;
 
-	if (prior_ioq_cnt != nr_io_queues) {
+	if (prior_ioq_cnt != nr_io_queues - 1) {
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
 			prior_ioq_cnt, nr_io_queues);

With this change the 'revising' path is also taken when the queue count
is unchanged, and I can't observe any I/O hanging. Without the change it
hangs in the first iteration.

In Wen's setup we observed during earlier debugging sessions that
nr_io_queues does change, which explains why Wen doesn't see any
hanging I/Os.

@James, I think we need to look a bit more at the freeze code.

BTW, my initial patch, which just added a nvme_start_freeze() call in
nvme_fc_delete_association(), doesn't work either for the
'prior_ioq_cnt == nr_io_queues' case.

So I think Ming's series can be merged, as the hanging I/Os are clearly
not caused by it. Feel free to add

Tested-by: Daniel Wagner <dwagner@xxxxxxx>

Thanks,
Daniel