nvme_stop_queues takes a long time when there are a large number of namespaces. With multipath, if one path fails, it takes a long time to fail over and retry, and the more namespaces there are, the longer it takes. This has a great impact on delay-sensitive services.

There are two options to fix it:
1. Use percpu instead of SRCU (Ming's patchset).
2. Use a tagset quiesce interface with SRCU (Sagi's patchset).

Both patchsets are still pending. This is a serious bug, and I hope we can revisit the solution. We may not have the best option, but we need to choose a relatively acceptable one. Can we fix the bug for non-blocking queues (which are used by fc and rdma) first?

Sagi & Ming, what do you think?
I don't recall any outstanding concerns that I had (I think they were all addressed). I'm fine with moving forward with it.