On Wed, Aug 14, 2019 at 03:14:47PM -0700, Andrew Morton wrote:
> On Wed, 14 Aug 2019 22:20:23 +0200 Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
>
> > Just a bit of paranoia, since if we start pushing this deep into
> > callchains it's hard to spot all places where an mmu notifier
> > implementation might fail when it's not allowed to.
> >
> > Inspired by some confusion we had discussing i915 mmu notifiers and
> > whether we could use the newly-introduced return value to handle some
> > corner cases. Until we realized that these are only for when a task
> > has been killed by the oom reaper.
> >
> > An alternative approach would be to split the callback into two
> > versions, one with the int return value, and the other with void
> > return value like in older kernels. But that's a lot more churn for
> > fairly little gain I think.
> >
> > Summary from the m-l discussion on why we want something at warning
> > level: This allows automated tooling in CI to catch bugs without
> > humans having to look at everything. If we just upgrade the existing
> > pr_info to a pr_warn, then we'll have false positives. And as-is, no
> > one will ever spot the problem since it's lost in the massive amounts
> > of overall dmesg noise.
> >
> > ...
> >
> > +++ b/mm/mmu_notifier.c
> > @@ -179,6 +179,8 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
> >  				pr_info("%pS callback failed with %d in %sblockable context.\n",
> >  					mn->ops->invalidate_range_start, _ret,
> >  					!mmu_notifier_range_blockable(range) ? "non-" : "");
> > +				WARN_ON(mmu_notifier_range_blockable(range) ||
> > +					ret != -EAGAIN);
> >  				ret = _ret;
> >  			}
> >  		}
>
> A problem with WARN_ON(a || b) is that if it triggers, we don't know
> whether it was because of a or because of b.  Or both.  So I'd suggest
>
> 	WARN_ON(a);
> 	WARN_ON(b);
>

Well, we did just make a pr_info right above with the value of
blockable, that seems enough to tell the cases apart?

But you are generally right, the full logic:

	if (_ret) {
		if (WARN_ON(mmu_notifier_range_blockable(range)))
			continue;
		WARN_ON(_ret != -EAGAIN);
		ret = -EAGAIN;
		break;
	}

would force correct API contract up the call chain once we detect a
broken driver..

But at some point it does feel like a bit much debugging logic to have
in a production code path, as this should never happen and is just to
discourage wrong driver behaviors during driver development.

If we like this version then:

Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx>

Also - I have a bunch of other patches to mmu notifiers for hmm.git,
so when everyone agrees I can grab this to avoid conflicts.

Thanks,
Jason
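
For reference, a minimal userspace sketch of how the stricter logic
above would behave. This is not the kernel code, only an illustration
under assumptions: WARN_ON() here is a stand-in that logs and counts,
and the callback type, the cb_* helpers, warn_count and
invalidate_range_start() are hypothetical simplifications of the real
notifier loop.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's WARN_ON(): logs, counts, returns cond. */
static int warn_count;

static bool warn_on(bool cond, const char *expr)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}
#define WARN_ON(cond) warn_on((cond), #cond)

/* Hypothetical, simplified callback type: returns 0 or a negative errno. */
typedef int (*invalidate_start_fn)(bool blockable);

static int cb_ok(bool blockable)     { (void)blockable; return 0; }
static int cb_eagain(bool blockable) { (void)blockable; return -EAGAIN; }
static int cb_enomem(bool blockable) { (void)blockable; return -ENOMEM; }

/* Simplified model of the notifier loop with the two separate warnings. */
static int invalidate_range_start(const invalidate_start_fn *cbs, size_t n,
				  bool blockable)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		int _ret = cbs[i](blockable);

		if (_ret) {
			/* Failing in a blockable context is never allowed:
			 * warn, ignore the error and keep going. */
			if (WARN_ON(blockable))
				continue;
			/* Non-blockable: -EAGAIN is the only legal failure,
			 * and it is what the caller sees either way. */
			WARN_ON(_ret != -EAGAIN);
			ret = -EAGAIN;
			break;
		}
	}
	return ret;
}
```

With this shape a broken callback in a blockable context produces one
warning and the caller still sees success, while any failure in a
non-blockable context reaches the caller as -EAGAIN, with a warning
only if the callback returned something else.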