Re: [PATCH 2/5] kernel.h: Add non_block_start/end()

Jason Gunthorpe <jgg@xxxxxxxx> · Thu, 15 Aug 2019 09:23:44 -0300

On Thu, Aug 15, 2019 at 08:58:29AM +0200, Daniel Vetter wrote:
> On Wed, Aug 14, 2019 at 08:58:05PM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 14, 2019 at 10:20:24PM +0200, Daniel Vetter wrote:
> > > In some special cases we must not block, but there's not a
> > > spinlock, preempt-off, irqs-off or similar critical section already
> > > that arms the might_sleep() debug checks. Add a non_block_start/end()
> > > pair to annotate these.
> > > 
> > > This will be used in the oom paths of mmu-notifiers, where blocking is
> > > not allowed to make sure there's forward progress. Quoting Michal:
> > > 
> > > "The notifier is called from quite a restricted context - oom_reaper -
> > > which shouldn't depend on any locks or sleepable conditionals. The code
> > > should be swift as well but we mostly do care about it to make a forward
> > > progress. Checking for sleepable context is the best thing we could come
> > > up with that would describe these demands at least partially."
> > 
> > But this describes fs_reclaim_acquire() - is there some reason we are
> > conflating fs_reclaim with non-sleeping?
> 
> No idea why you tie this into fs_reclaim. We can definitly sleep in there,
> and for e.g. kswapd (which also wraps everything in fs_reclaim) we're
> event supposed to I thought. To make sure we can get at the last bit of
> memory by flushing all the queues and waiting for everything to be cleaned
> out.

AFAIK the point of fs_reclaim is to prevent "indirect dependency upon
the page allocator" ie a justification that was given this !blockable
stuff.

For instance:

  fs_reclaim_acquire()
  kmalloc(GFP_KERNEL) <- lock dep assertion

And further, Michal's concern about indirectness through locks is also
handled by lockdep:

       CPU0                                 CPU1
                                        mutex_lock()
                                        kmalloc(GFP_KERNEL)
                                        mutex_unlock()
  fs_reclaim_acquire()
  mutex_lock() <- lock dep assertion

In other words, to prevent recursion into the page allocator you use
fs_reclaim_acquire(), and lockdep verfies it in its usual robust way.

I asked Tejun about this once in regards to WQ_MEM_RECLAIM and he
explained that it means you can't call the allocator functions in a
way that would recurse into reclaim (ie instead use instead GFP_ATOMIC, or
tolerate allocation failure, or various other things).

So, the reason I bring it up is half the justifications you posted for
blockable had to do with not recursing into reclaim and deadlocking,
and didn't seem to have much to do with blocking.

I'm asking if *non-blocking* is really the requirement or if this is
just the usual 'do not deadlock on the allocator' thing reclaim paths
alread have?

Jason