On Thu, Jan 31, 2019 at 09:00:42PM +0000, Jason Gunthorpe wrote:
> On Thu, Jan 31, 2019 at 01:56:15PM -0700, Parav Pandit wrote:
> >
> >
> > > From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-
> > > owner@xxxxxxxxxxxxxxx> On Behalf Of Steve Wise
> > > Sent: Thursday, January 31, 2019 1:31 PM
> > > To: dledford@xxxxxxxxxx; Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> > > Cc: linux-rdma@xxxxxxxxxxxxxxx
> > > Subject: [PATCH rdma-next] RDMA/addr: create addr_wq with
> > > WQ_MEM_RECLAIM flag
> > >
> > > While running NVMe/oF wire unplug tests, we hit this warning in
> > > kernel/workqueue.c:check_flush_dependency():
> > >
> > > WARN_ONCE(worker && ((worker->current_pwq->wq->flags &
> > > (WQ_MEM_RECLAIM | __WQ_LEGACY)) ==
> > > WQ_MEM_RECLAIM),
> > > "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing
> > > !WQ_MEM_RECLAIM %s:%pf",
> > > worker->current_pwq->wq->name, worker->current_func,
> > > target_wq->name, target_func);
> > >
> > > Which I think means we're flushing a workq that doesn't have
> > > WQ_MEM_RECLAIM set, from workqueue context that does have it set.
> > >
> > > Looking at rdma_addr_cancel() which is doing the flushing, it flushes the
> > > addr_wq which doesn't have MEM_RECLAIM set. Yet rdma_addr_cancel() is
> > > being called by the nvme host connection timeout/reconnect workqueue
> > > thread that does have WQ_MEM_RECLAIM set.
> > >
> > > So set WQ_MEM_RECLAIM on the addr_req workqueue.
> > >
> >
> > Please add below [1] fixes by tag.
> > I removed this flag based on commit [2] of Sagi and discussion[3].
> > Which I think [2] and [3] are incorrect.
> >
> > Memory reclaim path could have been triggered by processes or kernel
> > and I think it is ok to allocate a memory in a work item trigger as
> > part of reclaim path or otherwise. Memory allocation in a wq has
> > nothing to do with memory reclaim. I wish if Tejun or Christoph
> > confirm this is right or correct our understanding.
>
> Considering the only thing WQ_MEM_RECLAIM does is to pre-allocate a
> execution progress to guarantee forward progress if the system is
> unable to succeed in memory allocations..
>
> It seems a bit goofy to say that it is OK to then block on memory
> allocations inside that precious execution context..
>
> I also would like to understand what the rules should be for this flag
> before applying this patch??
>
> > Based these past discussions, it appears to me that WQ_MEM_RECLAIM
> > limitation and its use is not clear.
>
> I've always thought it was designed to guarantee forward progress if
> memory cannot be allocated.

I understood it very similarly, except for the last part. My version is:
"it was designed to guarantee forward progress, and to be prioritized, if
the memory shrinker is called."

Thanks

>
> Jason
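
For reference, the condition in the quoted WARN_ONCE can be modelled in
user space. This is only a sketch and not the kernel code: the flag values
and the helper name flush_would_warn() are made up for illustration, and
the early return for a WQ_MEM_RECLAIM target is assumed from the
"is flushing !WQ_MEM_RECLAIM" wording of the warning message.

```c
#include <stdbool.h>

/*
 * Illustrative bit values for this sketch only; the kernel defines the
 * real flags in include/linux/workqueue.h.
 */
#define SKETCH_WQ_MEM_RECLAIM (1u << 0)
#define SKETCH_WQ_LEGACY      (1u << 1)

/*
 * Hypothetical helper: returns true when the modelled warning would fire.
 *
 * flusher_flags: flags of the workqueue whose worker calls flush
 * target_flags:  flags of the workqueue being flushed
 */
static inline bool flush_would_warn(unsigned int flusher_flags,
                                    unsigned int target_flags)
{
    /* Assumed: a target that has WQ_MEM_RECLAIM never triggers the warning. */
    if (target_flags & SKETCH_WQ_MEM_RECLAIM)
        return false;

    /*
     * Same shape as the quoted test: the flusher has WQ_MEM_RECLAIM set
     * and is not a legacy workqueue.
     */
    return (flusher_flags & (SKETCH_WQ_MEM_RECLAIM | SKETCH_WQ_LEGACY))
           == SKETCH_WQ_MEM_RECLAIM;
}
```

With these definitions, a WQ_MEM_RECLAIM flusher (the nvme reconnect
workqueue in the report) flushing a target with no flags (addr_wq without
the patch) returns true. Once the target also has WQ_MEM_RECLAIM set, which
is what the patch does, the helper returns false.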