On 03/16/2018 08:47 PM, John Hubbard wrote:
> On 03/16/2018 07:36 PM, John Hubbard wrote:
>> On 03/16/2018 12:14 PM, jglisse@xxxxxxxxxx wrote:
>>> From: Ralph Campbell <rcampbell@xxxxxxxxxx>
>>>
>>
>> <snip>
>>
>>> +static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>>> +{
>>> +	struct hmm *hmm = mm->hmm;
>>> +	struct hmm_mirror *mirror;
>>> +	struct hmm_mirror *mirror_next;
>>> +
>>> +	down_write(&hmm->mirrors_sem);
>>> +	list_for_each_entry_safe(mirror, mirror_next, &hmm->mirrors, list) {
>>> +		list_del_init(&mirror->list);
>>> +		if (mirror->ops->release)
>>> +			mirror->ops->release(mirror);
>>> +	}
>>> +	up_write(&hmm->mirrors_sem);
>>> +}
>>> +
>>
>> OK, as for actual code review:
>>
>> This part of the locking looks good. However, I think it can race against
>> hmm_mirror_register(), because hmm_mirror_register() will just add a new
>> mirror regardless.
>>
>> So:
>>
>>     thread 1                             thread 2
>>     --------------                       -----------------
>>     hmm_release                          hmm_mirror_register
>>       down_write(&hmm->mirrors_sem);       <blocked: waiting for sem>
>>       // deletes all list items
>>       up_write
>>                                            unblocked: adds new mirror
>>

Mark Hairgrove just pointed out some more fun facts:

1. Because hmm_mirror_register() needs to be called with an mm that has a
   non-zero refcount, you generally cannot get an hmm_release() callback
   while a registration is in flight, so the above race should not happen.

2. We looked around, and the code is missing a call to
   mmu_notifier_unregister(). That means it is going to leak memory, and it
   will not let the mm get released either.

Maybe having each mirror register its own mmu_notifier callback is a
possible way to solve this.

thanks,
--
John Hubbard
NVIDIA
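
P.S. To make the race diagram above concrete, here is a minimal sketch of
one possible guard. This is illustrative only, not code from the patch: the
hmm->dead flag and the hmm_register() lookup helper are assumptions made up
for the sketch. The idea is that hmm_release() marks the hmm dead while
holding mirrors_sem for write, and hmm_mirror_register() checks that flag
under the same semaphore before adding a mirror.

static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
	struct hmm *hmm = mm->hmm;
	struct hmm_mirror *mirror;
	struct hmm_mirror *mirror_next;

	down_write(&hmm->mirrors_sem);
	hmm->dead = true;	/* hypothetical flag: refuse new mirrors */
	list_for_each_entry_safe(mirror, mirror_next, &hmm->mirrors, list) {
		list_del_init(&mirror->list);
		if (mirror->ops->release)
			mirror->ops->release(mirror);
	}
	up_write(&hmm->mirrors_sem);
}

int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
{
	struct hmm *hmm = hmm_register(mm);	/* assumed lookup/alloc helper */

	if (!hmm)
		return -ENOMEM;

	down_write(&hmm->mirrors_sem);
	if (hmm->dead) {
		/* hmm_release() already ran; don't repopulate the list */
		up_write(&hmm->mirrors_sem);
		return -EINVAL;
	}
	mirror->hmm = hmm;
	list_add(&mirror->list, &hmm->mirrors);
	up_write(&hmm->mirrors_sem);
	return 0;
}

With the flag check and the list teardown serialized on mirrors_sem, the
"unblocked: adds new mirror" step in the diagram fails with -EINVAL instead
of resurrecting the mirror list.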
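
For point 2, a rough sketch of where the missing mmu_notifier_unregister()
call could live, assuming a teardown path runs when the last mirror goes
away. The hmm_destroy() function and the hmm->mmu_notifier / hmm->mm fields
are assumptions here, not code from the patch:

static void hmm_destroy(struct hmm *hmm)
{
	/*
	 * Without this call the notifier stays registered forever, so
	 * the struct hmm leaks and the mm can never be freed.
	 */
	mmu_notifier_unregister(&hmm->mmu_notifier, hmm->mm);
	kfree(hmm);
}

void hmm_mirror_unregister(struct hmm_mirror *mirror)
{
	struct hmm *hmm = mirror->hmm;
	bool last;

	down_write(&hmm->mirrors_sem);
	list_del_init(&mirror->list);
	last = list_empty(&hmm->mirrors);
	up_write(&hmm->mirrors_sem);

	if (last)
		hmm_destroy(hmm);
}

Whether the unregister belongs in a shared teardown path like this, or in
per-mirror mmu notifiers as suggested above, is the open design question.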