On Thu 02-06-16 17:11:16, Peter Zijlstra wrote: > On Thu, Jun 02, 2016 at 04:50:49PM +0200, Michal Hocko wrote: > > On Wed 01-06-16 20:16:17, Peter Zijlstra wrote: > > > > So my favourite is the dedicated GFP flag, but if that's unpalatable for > > > the mm folks then something like the below might work. It should be > > > similar in effect to your proposal, except its more limited in scope. > > [...] > > > @@ -2876,11 +2883,36 @@ static void __lockdep_trace_alloc(gfp_t gfp_mask, unsigned long flags) > > > if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))) > > > return; > > > > > > + /* > > > + * Skip _one_ allocation as per the lockdep_skip_alloc() request. > > > + * Must be done last so that we don't loose the annotation for > > > + * GFP_ATOMIC like things from IRQ or other nesting contexts. > > > + */ > > > + if (current->lockdep_reclaim_gfp & __GFP_SKIP_ALLOC) { > > > + current->lockdep_reclaim_gfp &= ~__GFP_SKIP_ALLOC; > > > + return; > > > + } > > > + > > > mark_held_locks(curr, RECLAIM_FS); > > > } > > > > I might be missing something but does this work actually? Say you would > > want a kmalloc(size), it would call > > slab_alloc_node > > slab_pre_alloc_hook > > lockdep_trace_alloc > > [...] > > ____cache_alloc_node > > cache_grow_begin > > kmem_getpages > > __alloc_pages_node > > __alloc_pages_nodemask > > lockdep_trace_alloc > > Bugger :/ You're right, that would fail. > > So how about doing: > > #define __GFP_NOLOCKDEP (1u << __GFP_BITS_SHIFT) Hmm, now that I looked closer this would break GFP_SLAB_BUG_MASK :/ The whole thing is a bit hysterical because I really do not see any reason to blow up just because somebody has used incorrect gfp mask (we have users who give us combinations without any sense in the tree...) We can fix that either by dropping the whole GFP_SLAB_BUG_MASK thingy or to update it with __GFP_NOLOCKDEP. It just shows how this might get really tricky and subtle. > this means it cannot be part of address_space::flags or > radix_tree_root::gfp_mask, but that might not be a bad thing. True, those shouldn't really care. > And this solves the scarcity thing, because per pagemap we need to have > 5 'spare' bits anyway. > > > I understand your concerns about the scope but usually all allocations > > have to be __GFP_NOFS or none in the same scope so I would see it as a > > huge deal. > > With scope I mostly meant the fact that you have two calls that you need > to pair up. That's not really nice as you can 'annotate' a _lot_ of code > in between. I prefer the narrower annotations where you annotate a > single specific site. Yes, I can see you point. What I meant to say is that we would most probably end up with the following pattern lockdep_trace_alloc_enable() some_foo_with_alloc(gfp_mask); lockdep_trace_alloc_disable() and some_foo_with_alloc might be a lot of code. But at the same time we know that _any_ allocation done from that context is safe from the reclaim recursiveness POV. If not then annotation is buggy and needs to be done at a different level but that would be exactly same if we did some_foo_with_alloc(gfp_mask|__GFP_NOLOCKDEP) because all the allocations down that road would reuse the same gfp mask anyway. That being said I completely agree that a single entry point is much less error prone but it also is tricky as we can see. So I would rather go with something less tricky. It's not like people are not used to enable/disable pattern. Anyway I will leave the decision to you. If you really insist on __GFP_NOLOCKDEP which doesn't consume new flag then I can review the resulting patch. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>