On Wed, Oct 23, 2019 at 12:52:23PM -0400, Jerome Glisse wrote: > > Going another step further.... what hinders us to put the lock into the mmu > > range notifier itself and have _lock()/_unlock() helpers? > > > > I mean having the lock in the driver only makes sense when the driver would > > be using the same lock for multiple things, e.g. multiple MMU range > > notifiers under the same lock. But I really don't see that use case here. > > I actualy do, nouveau use one lock to protect the page table and that's the > lock that matter. You can have multiple range for a single page table, idea > being only a sub-set of the process address space is ever accessed by the > GPU and those it is better to focus on this sub-set and track invalidation in > a finer grain. mlx5 is similar, but not currently coded quite right, there is one lock that protects the command queue for submitting invalidations to the HW and it doesn't make a lot of sense to have additional fine grained locking beyond that. So I suppose the intent here that most drivers would have a single 'page table lock' that protects the HW's page table update, and this lock is the one that should be held while upating and checking the sequence number. dma_fence based drivers are possibly a little different, I think they can just use a spinlock, their pattern should probably be something like fault: hmm_range_fault() spin_lock() if (mmu_range_read_retry())) goto again dma_fence_init(mrn->fence) spin_unlock() invalidate: spin_lock() is_inited = 'dma fence init has been called' spin_unlock() if (is_inited) dma_fence_wait(fence) I'm not sure, never used dma_fence before. The key thing is that the dma_fence_wait() cannot block until after the mmu_range_read_retry() & unlock completes. Otherwise it can deadlock with hmm_range_fault(). It would be nice to figure this out and add it to the hmm.rst as we do have two drivers using the dma_fence scheme. Also, the use of a spinlock here probably says we should keep the lock external. But, it sounds like the mmu_range_notifier_update_seq() is a good idea, so let me add that in v2. Jason