On Mon, May 06, 2019 at 10:03:56PM +0300, Leon Romanovsky wrote:
> On Mon, May 06, 2019 at 03:16:10PM -0300, Jason Gunthorpe wrote:
> > On Mon, May 06, 2019 at 05:52:48PM +0000, Marciniszyn, Mike wrote:
> > > >
> > > > My mistake. It's been a long while since I coded the stuff I did for
> > > > memory reclaim pressure and I had my flag usage wrong in my memory.
> > > > From the description you just gave, the original patch to add
> > > > WQ_MEM_RECLAIM is ok. I probably still need to audit the ipoib usage
> > > > though.
> > > >
> > >
> > > Don't lose sight of the fact that the addition of the WQ_MEM_RECLAIM is to silence
> > > a warning BECAUSE ipoib's workqueue is WQ_MEM_RECLAIM. This happens while
> > > rdmavt/hfi1 is doing a cancel_work_sync() for the work item used by the QP's send engine.
> > >
> > > The ipoib wq needs to be audited to see if it is in the data path for VM I/O.
> >
> > Well, it is doing unsafe memory allocations and other stuff, so it
> > can't be RECLAIM. We should just delete them from IPoIB like Doug says.
>
> Please don't.

Well then fix the broken allocations it does, and I don't really see
how to do that. We can't have it both ways.

I would rather have NFS be broken than normal systems with ipoib
hanging during reclaim.

Jason