On Tue, Mar 02, 2021 at 07:57:58PM +1100, Alistair Popple wrote:

> The intent was a driver could use HMM or some other mechanism to keep PTEs
> synchronised if required. However I just looked at patch 8 in the series again
> and it appears I got this wrong when converting from the old migration
> approach:
>
> +       mutex_unlock(&svmm->mutex);
> +       ret = nouveau_atomic_range_fault(svmm, drm, args,
> +                                        size, hmm_flags, mm);
>
> The mutex needs to be unlocked after the range fault to ensure the PTE hasn't
> changed. But this ends up being a problem because try_to_protect() calls
> notifiers which need to take that mutex and hence deadlocks.

You have to check the notifier sequence under the mutex and loop
again. The mutex should only cover programming the HW to use the
pages, nothing else.

> However try_to_protect() scans the PTEs again under the PTL so checking the
> mapping of interest actually gets replaced during the rmap walk seems like a
> reasonable solution. Thanks for the comments.

It does seem cleaner if you can manage it, the notifier will still be
needed to program the HW though.

Jason
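
For illustration only, the "check the sequence under the mutex and loop again" shape can be sketched in userspace C. This is not the nouveau code and not the kernel API: every name below (fake_notifier, fake_dev, read_begin, read_retry, invalidate, fault_and_program, demo_run) is hypothetical, the sequence counter is a simplified stand-in for what mmu_interval_read_begin() and mmu_interval_read_retry() provide in the kernel, and a pthread mutex plays the role of svmm->mutex. The fault runs with the mutex dropped, so an invalidation triggered from inside it can take the mutex without deadlocking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an interval notifier: seq changes on invalidation. */
struct fake_notifier {
	atomic_ulong seq;
};

/* Hypothetical device state; lock plays the role of svmm->mutex. */
struct fake_dev {
	pthread_mutex_t lock;
	unsigned long hw_pte;	/* what the "HW" is currently programmed with */
};

static unsigned long read_begin(struct fake_notifier *n)
{
	return atomic_load(&n->seq);
}

static bool read_retry(struct fake_notifier *n, unsigned long seq)
{
	return atomic_load(&n->seq) != seq;
}

/* Invalidation callback: needs the device lock, bumps the sequence. */
static void invalidate(struct fake_notifier *n, struct fake_dev *d)
{
	pthread_mutex_lock(&d->lock);
	atomic_fetch_add(&n->seq, 1);
	d->hw_pte = 0;		/* tear down the device mapping */
	pthread_mutex_unlock(&d->lock);
}

/* Returns the number of fault attempts needed before the HW was programmed. */
static int fault_and_program(struct fake_notifier *n, struct fake_dev *d,
			     unsigned long (*do_fault)(void *), void *ctx)
{
	int attempts = 0;

	for (;;) {
		unsigned long seq = read_begin(n);
		/* Fault WITHOUT the lock held; it may call invalidate(). */
		unsigned long pte = do_fault(ctx);

		attempts++;
		assert(pte != 0);	/* in this sketch 0 means "unmapped" */

		pthread_mutex_lock(&d->lock);
		if (read_retry(n, seq)) {
			/* An invalidation raced with the fault: start over. */
			pthread_mutex_unlock(&d->lock);
			continue;
		}
		/* The lock covers only programming the HW, nothing else. */
		d->hw_pte = pte;
		pthread_mutex_unlock(&d->lock);
		return attempts;
	}
}

struct demo_ctx {
	struct fake_notifier *n;
	struct fake_dev *d;
	int calls;
};

/* The first fault triggers an invalidation, as a notifier call would. */
static unsigned long demo_fault(void *p)
{
	struct demo_ctx *c = p;

	if (c->calls++ == 0)
		invalidate(c->n, c->d);
	return 0x1000;
}

static int demo_run(unsigned long *hw_pte_out)
{
	struct fake_notifier n;
	struct fake_dev d = { .hw_pte = 0 };
	struct demo_ctx c = { &n, &d, 0 };
	int attempts;

	atomic_init(&n.seq, 0);
	pthread_mutex_init(&d.lock, NULL);
	attempts = fault_and_program(&n, &d, demo_fault, &c);
	*hw_pte_out = d.hw_pte;
	pthread_mutex_destroy(&d.lock);
	return attempts;
}
```

In demo_run() the first fault races with an invalidation, so the sequence check fails under the lock and the loop faults a second time before programming the HW. The invalidate() path also shows why the notifier stays necessary: it is the only place the device mapping gets torn down.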