On 03/16/2018 07:36 PM, John Hubbard wrote:
> On 03/16/2018 12:14 PM, jglisse@xxxxxxxxxx wrote:
>> From: Ralph Campbell <rcampbell@xxxxxxxxxx>
>>
>
> <snip>
>
>> +static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>> +{
>> +	struct hmm *hmm = mm->hmm;
>> +	struct hmm_mirror *mirror;
>> +	struct hmm_mirror *mirror_next;
>> +
>> +	down_write(&hmm->mirrors_sem);
>> +	list_for_each_entry_safe(mirror, mirror_next, &hmm->mirrors, list) {
>> +		list_del_init(&mirror->list);
>> +		if (mirror->ops->release)
>> +			mirror->ops->release(mirror);
>> +	}
>> +	up_write(&hmm->mirrors_sem);
>> +}
>> +
>
> OK, as for actual code review:
>
> This part of the locking looks good. However, I think it can race against
> hmm_mirror_register(), because hmm_mirror_register() will just add a new
> mirror regardless.
>
> So:
>
> thread 1                                  thread 2
> --------------                            -----------------
> hmm_release                               hmm_mirror_register
>     down_write(&hmm->mirrors_sem);            <blocked: waiting for sem>
>         // deletes all list items
>     up_write
>                                               unblocked: adds new mirror
>
>
> ...so I think we need a way to back out of any pending hmm_mirror_register()
> calls, as part of the .release steps, right? It seems hard for the device driver,
> which could be inside of hmm_mirror_register(), to handle that. Especially considering
> that right now, hmm_mirror_register() will return success in this case--so
> there is no indication that anything is wrong.
>
> Maybe hmm_mirror_register() could return an error (and not add to the mirror list),
> in such a situation, how's that sound?
>

In other words, I think this would help (not tested yet beyond a quick compile,
but it's pretty simple):

diff --git a/mm/hmm.c b/mm/hmm.c
index 7ccca5478ea1..da39f8522dca 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -66,6 +66,7 @@ struct hmm {
 	struct list_head mirrors;
 	struct mmu_notifier mmu_notifier;
 	struct rw_semaphore mirrors_sem;
+	bool shutting_down;
 };
 
 /*
@@ -99,6 +100,7 @@ static struct hmm *hmm_register(struct mm_struct *mm)
 	INIT_LIST_HEAD(&hmm->ranges);
 	spin_lock_init(&hmm->lock);
 	hmm->mm = mm;
+	hmm->shutting_down = false;
 
 	/*
 	 * We should only get here if hold the mmap_sem in write mode ie on
@@ -167,6 +169,7 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	struct hmm_mirror *mirror_next;
 
 	down_write(&hmm->mirrors_sem);
+	hmm->shutting_down = true;
 	list_for_each_entry_safe(mirror, mirror_next, &hmm->mirrors, list) {
 		list_del_init(&mirror->list);
 		if (mirror->ops->release)
@@ -227,6 +230,10 @@ int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
 		return -ENOMEM;
 
 	down_write(&mirror->hmm->mirrors_sem);
+	if (mirror->hmm->shutting_down) {
+		up_write(&mirror->hmm->mirrors_sem);
+		return -ESRCH;
+	}
 	list_add(&mirror->list, &mirror->hmm->mirrors);
 	up_write(&mirror->hmm->mirrors_sem);
 
thanks,
-- 
John Hubbard
NVIDIA
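
For anyone who wants to poke at the idea outside the kernel, here is a minimal
userspace sketch of the same flag-under-lock pattern. It is only a model, not
kernel code, and it rests on a few assumptions: the model_* names are
hypothetical, a pthread mutex stands in for mirrors_sem (both paths take the
semaphore in write mode, so plain mutual exclusion should be enough here), and a
singly linked list stands in for the list_head. The ops->release callback is
left out.

```c
/* Userspace model of the shutting_down check; NOT kernel code. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hmm_mirror. */
struct model_mirror {
	struct model_mirror *next;
};

/* Hypothetical stand-in for struct hmm. */
struct model_hmm {
	pthread_mutex_t mirrors_sem;	/* models the rw_semaphore, write mode only */
	struct model_mirror *mirrors;	/* models the mirrors list */
	bool shutting_down;		/* set once release has started */
};

static void model_hmm_init(struct model_hmm *hmm)
{
	pthread_mutex_init(&hmm->mirrors_sem, NULL);
	hmm->mirrors = NULL;
	hmm->shutting_down = false;
}

/* Models hmm_release(): mark shutdown, then drain the list, all under the lock. */
static void model_hmm_release(struct model_hmm *hmm)
{
	pthread_mutex_lock(&hmm->mirrors_sem);
	hmm->shutting_down = true;
	while (hmm->mirrors) {
		struct model_mirror *m = hmm->mirrors;

		hmm->mirrors = m->next;
		m->next = NULL;
	}
	assert(hmm->mirrors == NULL);
	pthread_mutex_unlock(&hmm->mirrors_sem);
}

/* Models hmm_mirror_register(): refuse to add once shutdown has begun. */
static int model_mirror_register(struct model_hmm *hmm,
				 struct model_mirror *mirror)
{
	pthread_mutex_lock(&hmm->mirrors_sem);
	if (hmm->shutting_down) {
		pthread_mutex_unlock(&hmm->mirrors_sem);
		return -ESRCH;
	}
	mirror->next = hmm->mirrors;
	hmm->mirrors = mirror;
	pthread_mutex_unlock(&hmm->mirrors_sem);
	return 0;
}
```

Because the flag is both set and tested while the lock is held, a registration
that was blocked behind release sees shutting_down == true as soon as it gets
the lock and backs out with -ESRCH. A registration that gets the lock first is
added and then removed by release. Either way, nothing is left on the list once
release has finished.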