On 6/7/19 5:34 AM, Jason Gunthorpe wrote:
> On Thu, Jun 06, 2019 at 07:29:08PM -0700, John Hubbard wrote:
>> On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
>>> From: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
>> ...
>>> diff --git a/mm/hmm.c b/mm/hmm.c
>>> index 8e7403f081f44a..547002f56a163d 100644
>>> +++ b/mm/hmm.c
>> ...
>>> @@ -125,7 +130,7 @@ static void hmm_free(struct kref *kref)
>>>  	mm->hmm = NULL;
>>>  	spin_unlock(&mm->page_table_lock);
>>>
>>> -	kfree(hmm);
>>> +	mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
>>
>> It occurred to me to wonder if it is best to use the MMU notifier's
>> instance of srcu, instead of creating a separate instance for HMM.
>
> It *has* to be the MMU notifier SRCU because we are synchronizing
> against the read side of that SRCU inside the mmu notifier code, ie:
>
> int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
> 	id = srcu_read_lock(&srcu);
> 	hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
> 		if (mn->ops->invalidate_range_start) {
> 		    ^^^^^
>
> Here 'mn' is really hmm (hmm = container_of(mn, struct hmm,
> mmu_notifier)), so we must protect the memory against free for the mmu
> notifier core.
>
> Thus we have no choice but to use its SRCU.

Ah right. It's embarrassingly obvious when you say it out loud. :)
Thanks for explaining.

> CH also pointed out a more elegant solution, which is to get the write
> side of the mmap_sem during hmm_mirror_unregister - no notifier
> callback can be running in this case. Then we delete the kref, srcu
> and so forth.
>
> This is much clearer/saner/better, but.. requires the callers of
> hmm_mirror_unregister to be safe to get the mmap_sem write side.
>
> I think this is true, so maybe this patch should be switched, what do
> you think?

OK, your follow-up notes that we'll leave it as is, got it.

thanks,
--
John Hubbard
NVIDIA