Re: [PATCH rdma-next] RDMA/addr: create addr_wq with WQ_MEM_RECLAIM flag

On 2/6/2019 3:52 PM, Steve Wise wrote:
> On 2/5/2019 4:39 PM, Jason Gunthorpe wrote:
>> On Thu, Jan 31, 2019 at 11:30:42AM -0800, Steve Wise wrote:
>>> While running NVMe/oF wire unplug tests, we hit this warning in
>>> kernel/workqueue.c:check_flush_dependency():
>>>
>>> WARN_ONCE(worker && ((worker->current_pwq->wq->flags &
>>> 		      (WQ_MEM_RECLAIM | __WQ_LEGACY)) == WQ_MEM_RECLAIM),
>>> 	  "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing !WQ_MEM_RECLAIM %s:%pf",
>>> 	  worker->current_pwq->wq->name, worker->current_func,
>>> 	  target_wq->name, target_func);
>>>
>>> Which I think means we're flushing a workq that doesn't have
>>> WQ_MEM_RECLAIM set, from workqueue context that does have it set.
>>>
>>> Looking at rdma_addr_cancel() which is doing the flushing, it flushes
>>> the addr_wq which doesn't have MEM_RECLAIM set.  Yet rdma_addr_cancel()
>>> is being called by the nvme host connection timeout/reconnect workqueue
>>> thread that does have WQ_MEM_RECLAIM set.
>> Since we haven't learned anything more, I think you should look to
>> remove either the WQ_MEM_RECLAIM or the rdma_addr_cancel() from the
>> nvme side.
> The nvme code is just calling rdma_destroy_id(), which in turn calls
> rdma_addr_cancel(),  so I'll have to remove WQ_MEM_RECLAIM from the
> workqueue.
>
> I'll post this patch then.


What a mess.  If I remove WQ_MEM_RECLAIM from nvme_wq, I think I'll
regress this change:

c669ccdc50c2 ("nvme: queue ns scanning and async request from nvme_wq")
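
For reference, the condition in the WARN_ONCE quoted above can be modeled in
userspace. This is only a sketch, not the kernel code: the struct, the
function name, and the flag bit values are made up for illustration, and the
early return for a target that already has WQ_MEM_RECLAIM is an assumption
based on the warning text ("is flushing !WQ_MEM_RECLAIM").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits for this model only; not the kernel's values. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)
#define MODEL_WQ_LEGACY      (1u << 1)

/* Hypothetical stand-in for a workqueue: just a name and its flags. */
struct model_wq {
    const char *name;
    unsigned int flags;
};

/*
 * Returns true when flushing target_wq from a work item running on
 * current_wq would trip the quoted warning.
 * current_wq == NULL models a caller that is not a workqueue worker.
 */
bool flush_would_warn(const struct model_wq *current_wq,
                      const struct model_wq *target_wq)
{
    /* Assumption: a target that has MEM_RECLAIM is always safe to flush. */
    if (target_wq->flags & MODEL_WQ_MEM_RECLAIM)
        return false;

    /* Not running in workqueue context: the "worker &&" part is false. */
    if (current_wq == NULL)
        return false;

    /* Same mask-and-compare as the quoted WARN_ONCE condition. */
    return (current_wq->flags & (MODEL_WQ_MEM_RECLAIM | MODEL_WQ_LEGACY))
           == MODEL_WQ_MEM_RECLAIM;
}
```

In this model, a flusher with MODEL_WQ_MEM_RECLAIM set and a target with no
flags gives true, which is the nvme_wq -> addr_wq case above.  Setting the
flag on the target, or clearing it on the flusher, both give false; those
are the two ways out discussed in this thread.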




