The problem is that nvme_wq is MEM_RECLAIM, while nvme_tcp_wq is
for the socket threads and does not need to be a MEM_RECLAIM workqueue.
Why don't we need MEM_RECLAIM for the socket threads?
But reset/error recovery, which takes place on nvme_wq, stops the nvme-tcp
queues, and that must involve flushing queue->io_work in order to
fence concurrent execution.
So what is the solution? Make nvme_tcp_wq MEM_RECLAIM?
I think so.
OK.
Yi, does this patch make the issue go away?
--
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 0a9542599ad1..dc3b4dc8fe08 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1839,7 +1839,8 @@ static int __init nvmet_tcp_init(void)
 {
 	int ret;
-	nvmet_tcp_wq = alloc_workqueue("nvmet_tcp_wq", WQ_HIGHPRI, 0);
+	nvmet_tcp_wq = alloc_workqueue("nvmet_tcp_wq",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
 	if (!nvmet_tcp_wq)
 		return -ENOMEM;
--
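
For anyone following along, here is a rough userspace model of the flush dependency rule discussed above: a work item running on a MEM_RECLAIM workqueue should not wait on work queued to a workqueue without MEM_RECLAIM. This is only a sketch, not kernel code. The struct, the flag values, and the function model_flush_allowed() are all hypothetical names made up for illustration, and the real check in the workqueue core handles more cases than this.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag values only; these are NOT the kernel's definitions. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)
#define MODEL_WQ_HIGHPRI     (1u << 1)

/* Hypothetical stand-in for a workqueue: just a name and its flags. */
struct model_wq {
	const char *name;
	unsigned int flags;
};

/*
 * Simplified model of the rule: returns true when a work item running on
 * 'flusher' may wait for work queued on 'target' without creating a
 * reclaim dependency problem.
 */
bool model_flush_allowed(const struct model_wq *flusher,
			 const struct model_wq *target)
{
	/* A MEM_RECLAIM target is guaranteed forward progress under memory pressure. */
	if (target->flags & MODEL_WQ_MEM_RECLAIM)
		return true;

	/* A MEM_RECLAIM flusher must not depend on a target without that guarantee. */
	if (flusher->flags & MODEL_WQ_MEM_RECLAIM) {
		fprintf(stderr, "MEM_RECLAIM %s is flushing !MEM_RECLAIM %s\n",
			flusher->name, target->name);
		return false;
	}

	/* Neither side is MEM_RECLAIM: no reclaim dependency to worry about. */
	return true;
}
```

In this model, a target created with only MODEL_WQ_HIGHPRI is rejected when the flusher is MEM_RECLAIM, and adding MODEL_WQ_MEM_RECLAIM to the target (which is what the patch does for nvmet_tcp_wq) makes the flush acceptable.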