Re: [PATCH for-rc 1/5] IB/hfi1: Fix WQ_MEM_RECLAIM warning

On Mon, May 06, 2019 at 04:18:44PM -0400, Chuck Lever wrote:
> 
> 
> > On May 6, 2019, at 4:08 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > 
> > On Mon, May 06, 2019 at 10:03:56PM +0300, Leon Romanovsky wrote:
> >> On Mon, May 06, 2019 at 03:16:10PM -0300, Jason Gunthorpe wrote:
> >>> On Mon, May 06, 2019 at 05:52:48PM +0000, Marciniszyn, Mike wrote:
> >>>>> 
> >>>>> My mistake.  It's been a long while since I coded the stuff I did for
> >>>>> memory reclaim pressure and I had my flag usage wrong in my memory.
> >>>>> From the description you just gave, the original patch to add
> >>>>> WQ_MEM_RECLAIM is ok.  I probably still need to audit the ipoib usage
> >>>>> though.
> >>>>> 
> >>>> 
> >>>> Don't lose sight of the fact that the addition of WQ_MEM_RECLAIM is to silence
> >>>> a warning BECAUSE ipoib's workqueue is WQ_MEM_RECLAIM.  This happens while
> >>>> rdmavt/hfi1 is doing a cancel_work_sync() for the work item used by the QP's send engine.
> >>>> 
> >>>> The ipoib wq needs to be audited to see if it is in the data path for VM I/O.
> >>> 
> >>> Well, it is doing unsafe memory allocations and other stuff, so it
> >>> can't be RECLAIM. We should just delete them from IPoIB like Doug says.
> >> 
> >> Please don't.
> > 
> > Well then fix the broken allocations it does, and I don't really see
> > how to do that. We can't have it both ways.
> > 
> > I would rather have NFS be broken than normal systems with ipoib
> > hanging during reclaim.
> 
> TBH, NFS on IPoIB is a hack that I would be happy to see replaced
> with NFS/RDMA.

Does NFS/RDMA even work? Tejun said you can't do a GFP_KERNEL
allocation on the writeback progress path.

So if the system runs out of memory, and NFS/RDMA is in a state where
the connection has glitched and needs to be restarted, it needs to go
through the whole CM stuff to progress writeback. That stuff uses
GFP_KERNEL, so it is not OK.

This is where nvme got to when it started to look at this problem.
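
For reference, the rule behind the warning discussed earlier in the
thread can be modelled in a few lines of userspace C. This is only a
sketch of how I understand the flush-dependency check to behave, not
the kernel code: MODEL_WQ_MEM_RECLAIM, struct model_wq, the
model_*() functions and the queue labels are all made up for
illustration, and the model leaves out the other conditions the real
check looks at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model flag only: stands in for the kernel's WQ_MEM_RECLAIM bit. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)

/* Hypothetical stand-in for a workqueue; not a kernel structure. */
struct model_wq {
	const char *name;	/* label for illustration only */
	unsigned int flags;
};

/*
 * Returns true when a flush/cancel-sync would be flagged in this model:
 * the waiter runs on a reclaim-safe queue, but the work it waits for
 * sits on a queue with no forward-progress guarantee under memory
 * pressure. flusher == NULL models a wait from ordinary process context.
 */
static bool model_flush_warns(const struct model_wq *flusher,
			      const struct model_wq *target)
{
	assert(target != NULL);

	if (target->flags & MODEL_WQ_MEM_RECLAIM)
		return false;	/* target can always make progress */
	if (flusher == NULL)
		return false;	/* not running on any workqueue */
	return (flusher->flags & MODEL_WQ_MEM_RECLAIM) != 0;
}

/* The situation from this thread: ipoib work waits on hfi1 work. */
static bool model_thread_example(bool hfi1_has_reclaim)
{
	struct model_wq ipoib = { "ipoib_wq", MODEL_WQ_MEM_RECLAIM };
	struct model_wq hfi1 = { "hfi1_wq",
				 hfi1_has_reclaim ? MODEL_WQ_MEM_RECLAIM : 0u };

	return model_flush_warns(&ipoib, &hfi1);
}
```

In this model the warning goes away either by marking the hfi1 queue
reclaim-safe (the patch) or by dropping the flag from ipoib, which is
the choice being argued over. Neither option says anything about
whether the work items themselves avoid GFP_KERNEL.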

> If you are truly curious, bring this up on linux-nfs to find out
> what NFS needs and how it works on Ethernet-only network stacks.

I have a feeling the only robust answer here is that NFS can never be
on the critical path of reclaim.

Jason


