Re: [PATCH for-rc 1/5] IB/hfi1: Fix WQ_MEM_RECLAIM warning

On Wed, 2019-05-01 at 13:29 -0300, Jason Gunthorpe wrote:
> On Wed, May 01, 2019 at 11:21:08AM -0400, Doug Ledford wrote:
> > On Mar 27, 2019, at 1:02 PM, Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > > On Wed, Mar 27, 2019 at 08:25:17AM -0700, Tejun Heo (tj@xxxxxxxxxx) wrote:
> > > > Hello,
> > > > 
> > > > On Tue, Mar 26, 2019 at 08:55:09PM +0000, Marciniszyn, Mike wrote:
> > > > > The latter is the ipoib wq that conflicts with our non-WQ_MEM_RECLAIM workqueue.  This seems excessive and pretty gratuitous.
> > > > > 
> > > > > Tejun, what does "mem reclaim" really mean and when should it be used?
> > > > 
> > > > That it may be depended upon during memory reclaim.
> > > > 
> > > > > I was assuming that since we are freeing QP kernel memory held by user mode programs that could be oom killed, we need the flag.
> > > > 
> > > > If it can't block memory reclaim, it doesn't need the flag.  Just in
> > > > case, if a workqueue is used to issue block IOs, it is depended upon
> > > > for memory reclaim as writeback and swap-outs are critical parts of
> > > > memory reclaim.
> > > 
> > > It looks like WQ_MEM_RECLAIM is needed for IPoIB, because if NFS runs
> > > over IPoIB, it will do those types of IOs.
> > 
> > Because of what IPoIB does, you’re right that it’s needed.  However,
> > it might be necessary to audit the wq usage in IPoIB to make sure
> > it’s actually eligible for the flag and that it hasn’t been set when
> > the code doesn’t meet the requirements of the flag.
> 
> It isn't right - it is doing memory allocations from the work queue
> without GFP_NOIO (or memalloc_noio_save)
> 
> And I'm not sure it can actually tolerate failure of a memory
> allocation anyhow without blocking the dataplane.
> 
> In other words, the entire thing hasn't been designed with the idea
> that it could be on the IO path..
> 
> I'm not sure how things work if NFS is on the critical reclaim path in
> general - does it even work with a normal netdev? How does netdev
> allocate a neighbor for instance if it becomes required?
> 
> Jason

What we probably need to do (though not this late in the rc cycle; save
it for next) is remove WQ_MEM_RECLAIM anywhere an audit can't
conclusively show that it's safe.  As I understand it, without the flag
the memory subsystem will explore other avenues of memory reclaim, which
is better than the alternative: setting the flag on a work queue that
isn't safe and ending up in a deadlock or some other abnormal failure.
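
For reference, here is a rough sketch of the pattern Jason describes as
missing: a work item on a WQ_MEM_RECLAIM queue that wraps its
allocations in memalloc_noio_save()/memalloc_noio_restore() and copes
with allocation failure.  The module and all of the example_* names are
hypothetical and are not taken from ipoib or hfi1; treat it as an
illustration of what an audit would look for, not as a proposed patch.

```c
/* Sketch only: a hypothetical module, not code from ipoib or hfi1. */
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/slab.h>
#include <linux/sched/mm.h>	/* memalloc_noio_save/restore */

static struct workqueue_struct *example_wq;	/* hypothetical name */
static struct work_struct example_work;

static void example_work_fn(struct work_struct *work)
{
	unsigned int noio_flags;
	void *buf;

	/*
	 * Allocations inside this scope are implicitly GFP_NOIO, so any
	 * reclaim they trigger cannot recurse into the I/O path.
	 */
	noio_flags = memalloc_noio_save();

	buf = kmalloc(256, GFP_KERNEL);	/* treated as GFP_NOIO here */
	if (!buf)
		goto out;	/* the work item must tolerate failure */

	/* ... do the actual work with buf ... */
	kfree(buf);
out:
	memalloc_noio_restore(noio_flags);
}

static int __init example_init(void)
{
	/* WQ_MEM_RECLAIM reserves a rescuer thread for forward progress. */
	example_wq = alloc_workqueue("example_wq", WQ_MEM_RECLAIM, 0);
	if (!example_wq)
		return -ENOMEM;

	INIT_WORK(&example_work, example_work_fn);
	queue_work(example_wq, &example_work);
	return 0;
}

static void __exit example_exit(void)
{
	destroy_workqueue(example_wq);	/* drains pending work first */
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```

A work function that allocates with plain GFP_KERNEL outside such a
scope, or that can't make progress when the allocation fails, is the
kind of thing that should count against keeping the flag.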

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


