Re: [PATCH 2/3] xfs: add kmem_alloc_io()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 22, 2019 at 11:10:57AM +0200, Peter Zijlstra wrote:
> On Thu, Aug 22, 2019 at 10:51:30AM +0200, Peter Zijlstra wrote:
> > On Thu, Aug 22, 2019 at 12:59:48AM -0700, Christoph Hellwig wrote:
> > > On Thu, Aug 22, 2019 at 10:31:32AM +1000, Dave Chinner wrote:
> > > > > Btw, I think we should eventually kill off KM_NOFS and just use
> > > > > PF_MEMALLOC_NOFS in XFS, as the interface makes so much more sense.
> > > > > But that's something for the future.
> > > > 
> > > > Yeah, and it's not quite as simple as just using PF_MEMALLOC_NOFS
> > > > at high levels - we'll still need to annotate callers that use KM_NOFS
> > > > to avoid lockdep false positives. i.e. any code that can be called from
> > > > GFP_KERNEL and reclaim context will throw false positives from
> > > > lockdep if we don't annotate tehm correctly....
> > > 
> > > Oh well.  For now we have the XFS kmem_wrappers to turn that into
> > > GFP_NOFS so we shouldn't be too worried, but I think that is something
> > > we should fix in lockdep to ensure it is generally useful.  I've added
> > > the maintainers and relevant lists to kick off a discussion.
> > 
> > Strictly speaking the fs_reclaim annotation is no longer part of the
> > lockdep core, but is simply a fake lock in page_alloc.c and thus falls
> > under the mm people's purview.
> > 
> > That said; it should be fairly straight forward to teach
> > __need_fs_reclaim() about PF_MEMALLOC_NOFS, much like how it already
> > knows about PF_MEMALLOC.
> 
> Ah, current_gfp_context() already seems to transfer PF_MEMALLOC_NOFS
> into the GFP flags.
> 
> So are we sure it is broken and needs mending?

Well, that's what we are trying to work out. The problem is that we
have code that takes locks and does allocations that is called both
above and below the reclaim "lock" context. Once it's been seen
below the reclaim lock context, calling it with GFP_KERNEL context
above the reclaim lock context throws a deadlock warning.

The only way around that was to mark these allocation sites as
GFP_NOFS so lockdep is never allowed to see that recursion through
reclaim occur. Even though it isn't a deadlock vector.

What we're looking at is whether PF_MEMALLOC_NOFS changes this - I
don't think it does solve this problem. i.e. if we define the
allocation as GFP_KERNEL and then use PF_MEMALLOC_NOFS where reclaim
is not allowed, we still have GFP_KERNEL allocations in code above
reclaim that has also been seen below relcaim. And so we'll get
false positive warnings again.

What I think we are going to have to do here is manually audit
each of the KM_NOFS call sites as we remove the NOFS from them and
determine if ___GFP_NOLOCKDEP is needed to stop lockdep from trying
to track these allocation sites. We've never used this tag because
we'd already fixed most of these false positives with explicit
GFP_NOFS tags long before ___GFP_NOLOCKDEP was created.

But until someone starts doing the work, I don't know if it will
work or even whether conversion PF_MEMALLOC_NOFS is going to
introduce a bunch of new ways to get false positives from lockdep...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux