On Fri 24-08-18 22:02:23, Tetsuo Handa wrote: > On 2018/08/24 20:36, Michal Hocko wrote: > >> That is, this API seems to be currently used by only out-of-tree users. Since > >> we can't check that nobody has memory allocation dependency, I think that > >> hmm_invalidate_range_start() should return -EAGAIN if blockable == false for now. > > > > The code expects that the invalidate_range_end doesn't block if > > invalidate_range_start hasn't blocked. That is the reason why the end > > callback doesn't have blockable parameter. If this doesn't hold then the > > whole scheme is just fragile because those two calls should pair. > > > That is > > More worrisome part in that patch is that I don't know whether using > trylock if blockable == false at entry is really sufficient. > > . Since those two calls should pair, I think that we need to determine whether > we need to return -EAGAIN at start call by evaluating both calls. Yes, and I believe I have done that audit. Module my misunderstanding of the code. > Like mn_invl_range_start() involves schedule_delayed_work() which could be > blocked on memory allocation under OOM situation, It doesn't because that code path is not invoked for the !blockable case. > I worry that (currently > out-of-tree) users of this API are involving work / recursion. I do not give a slightest about out-of-tree modules. They will have to accomodate to the new API. I have no problems to extend the documentation and be explicit about this expectation. diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index 133ba78820ee..698e371aafe3 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -153,7 +153,9 @@ struct mmu_notifier_ops { * * If blockable argument is set to false then the callback cannot * sleep and has to return with -EAGAIN. 0 should be returned - * otherwise. + * otherwise. Please note that if invalidate_range_start approves + * a non-blocking behavior then the same applies to + * invalidate_range_end. * */ int (*invalidate_range_start)(struct mmu_notifier *mn, > And hmm_release() says that > > /* > * Drop mirrors_sem so callback can wait on any pending > * work that might itself trigger mmu_notifier callback > * and thus would deadlock with us. > */ > > and keeps "all operations protected by hmm->mirrors_sem held for write are > atomic". This suggests that "some operations protected by hmm->mirrors_sem held > for read will sleep (and in the worst case involves memory allocation > dependency)". Yes and so what? The clear expectation is that neither of the range notifiers do not sleep in !blocking mode. I really fail to see what you are trying to say. -- Michal Hocko SUSE Labs