On Mon, May 06, 2019 at 04:18:44PM -0400, Chuck Lever wrote:
>
>
> > On May 6, 2019, at 4:08 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >
> > On Mon, May 06, 2019 at 10:03:56PM +0300, Leon Romanovsky wrote:
> >> On Mon, May 06, 2019 at 03:16:10PM -0300, Jason Gunthorpe wrote:
> >>> On Mon, May 06, 2019 at 05:52:48PM +0000, Marciniszyn, Mike wrote:
> >>>>>
> >>>>> My mistake. It's been a long while since I coded the stuff I did for
> >>>>> memory reclaim pressure and I had my flag usage wrong in my memory.
> >>>>> From the description you just gave, the original patch to add
> >>>>> WQ_MEM_RECLAIM is ok. I probably still need to audit the ipoib usage
> >>>>> though.
> >>>>>
> >>>>
> >>>> Don't lose sight of the fact that the additional of the WQ_MEM_RECLAIM is to silence
> >>>> a warning BECAUSE ipoib's workqueue is WQ_MEM_RECLAIM. This happens while
> >>>> rdmavt/hfi1 is doing a cancel_work_sync() for the work item used by the QP's send engine
> >>>>
> >>>> The ipoib wq needs to be audited to see if it is in the data path for VM I/O.
> >>>
> >>> Well, it is doing unsafe memory allocations and other stuff, so it
> >>> can't be RECLAIM. We should just delete them from IPoIB like Doug says.
> >>
> >> Please don't.
> >
> > Well then fix the broken allocations it does, and I don't really see
> > how to do that. We can't have it both ways.
> >
> > I would rather have NFS be broken then normal systems with ipoib
> > hanging during reclaim.
>
> TBH, NFS on IPoIB is a hack that I would be happy to see replaced
> with NFS/RDMA.

Does NFS/RDMA even work? Tejun said you can't do a GFP_KERNEL
allocation on the writeback progress path.

So if the system runs out of memory, and NFS/RDMA is in a state where
the connection has glitched and needs to be restarted it needs to go
through the whole CM stuff to progress writeback. That stuff uses
GFP_KERNEL, so it is not OK.

This is where nvme got to when it started to look at this problem.

> If you are truly curious, bring this up on linux-nfs to find out
> what NFS needs and how it works on Ethernet-only network stacks.

I have a feeling the only robust answer here is that NFS can never be
on the critical path of reclaim

Jason