On Sun, Jul 24, 2022 at 4:21 PM Sagi Grimberg <sagi@xxxxxxxxxxx> wrote:
>
>
> >> The problem is that nvme_wq is MEM_RECLAIM, and nvme_tcp_wq is
> >> for the socket threads, that does not need to be MEM_RECLAIM workqueue.
> >
> > Why don't we need MEM_RECLAIM for the socket threads?
> >
> >> But reset/error-recovery that take place on nvme_wq, stop nvme-tcp
> >> queues, and that must involve flushing queue->io_work in order to
> >> fence concurrent execution.
> >>
> >> So what is the solution? make nvme_tcp_wq MEM_RECLAIM?
> >
> > I think so.
>
> OK.
>
> Yi, does this patch makes the issue go away?

I tried to find a server to reproduce the issue manually, but had no
luck. Since the patch has already been merged, I will keep monitoring
the CKI tests for this issue.

> --
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 0a9542599ad1..dc3b4dc8fe08 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1839,7 +1839,8 @@ static int __init nvmet_tcp_init(void)
>  {
>  	int ret;
>
> -	nvmet_tcp_wq = alloc_workqueue("nvmet_tcp_wq", WQ_HIGHPRI, 0);
> +	nvmet_tcp_wq = alloc_workqueue("nvmet_tcp_wq",
> +				WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
>  	if (!nvmet_tcp_wq)
>  		return -ENOMEM;
>
> --
>

--
Best Regards,
  Yi Zhang
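
For readers following the thread, the dependency being discussed can be
sketched as a small userspace model. This is not kernel code: the struct,
the flag values and the helper name below are all hypothetical, and the
rule it encodes is only my reading of the discussion, namely that work
running on a MEM_RECLAIM workqueue should not have to wait for work on a
queue that lacks the same forward-progress guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits for this model only; NOT the kernel's values. */
enum model_wq_flags {
	MODEL_WQ_MEM_RECLAIM = 1u << 0,
	MODEL_WQ_HIGHPRI     = 1u << 1,
};

/* Hypothetical stand-in for a workqueue: just a name and its flags. */
struct model_wq {
	const char *name;
	unsigned int flags;
};

/*
 * Returns true when work running on `flusher` can wait for work queued
 * on `target` without weakening the flusher's reclaim guarantee.
 */
bool model_flush_is_safe(const struct model_wq *flusher,
			 const struct model_wq *target)
{
	assert(flusher != NULL && target != NULL);

	/* A MEM_RECLAIM target can always make progress under pressure. */
	if (target->flags & MODEL_WQ_MEM_RECLAIM)
		return true;

	/*
	 * The target has no such guarantee, so the flush is only safe
	 * when the flusher does not promise forward progress either.
	 */
	return !(flusher->flags & MODEL_WQ_MEM_RECLAIM);
}
```

In this model, a MEM_RECLAIM flusher such as nvme_wq waiting on a queue
created with only the HIGHPRI flag is reported as unsafe, and adding the
MEM_RECLAIM flag to that queue makes the same flush safe. That mirrors the
flag change in the quoted diff.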