On Wed, Oct 23, 2019 at 11:32:16AM +0200, Christian König wrote: > Am 23.10.19 um 11:08 schrieb Daniel Vetter: > > On Tue, Oct 22, 2019 at 03:01:13PM +0000, Jason Gunthorpe wrote: > > > On Tue, Oct 22, 2019 at 09:57:35AM +0200, Daniel Vetter wrote: > > > > > > > > The unusual bit in all of this is using a lock's critical region to > > > > > 'protect' data for read, but updating that same data before the lock's > > > > > critical secion. ie relying on the unlock barrier to 'release' program > > > > > ordered stores done before the lock's own critical region, and the > > > > > lock side barrier to 'acquire' those stores. > > > > I think this unusual use of locks as barriers for other unlocked accesses > > > > deserves comments even more than just normal barriers. Can you pls add > > > > them? I think the design seeems sound ... > > > > > > > > Also the comment on the driver's lock hopefully prevents driver > > > > maintainers from moving the driver_lock around in a way that would very > > > > subtle break the scheme, so I think having the acquire barrier commented > > > > in each place would be really good. > > > There is already a lot of documentation, I think it would be helpful > > > if you could suggest some specific places where you think an addition > > > would help? I think the perspective of someone less familiar with this > > > design would really improve the documentation > > Hm I just meant the usual recommendation that "barriers must have comments > > explaining what they order, and where the other side of the barrier is". > > Using unlock/lock as a barrier imo just makes that an even better idea. > > Usually what I do is something like "we need to order $this against $that > > below, and the other side of this barrier is in function()." With maybe a > > bit more if it's not obvious how things go wrong if the orderin is broken. > > > > Ofc seqlock.h itself skimps on that rule and doesn't bother explaining its > > barriers :-/ > > > > > I've been tempted to force the driver to store the seq number directly > > > under the driver lock - this makes the scheme much clearer, ie > > > something like this: > > > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c > > > index 712c99918551bc..738fa670dcfb19 100644 > > > --- a/drivers/gpu/drm/nouveau/nouveau_svm.c > > > +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c > > > @@ -488,7 +488,8 @@ struct svm_notifier { > > > }; > > > static bool nouveau_svm_range_invalidate(struct mmu_range_notifier *mrn, > > > - const struct mmu_notifier_range *range) > > > + const struct mmu_notifier_range *range, > > > + unsigned long seq) > > > { > > > struct svm_notifier *sn = > > > container_of(mrn, struct svm_notifier, notifier); > > > @@ -504,6 +505,7 @@ static bool nouveau_svm_range_invalidate(struct mmu_range_notifier *mrn, > > > mutex_lock(&sn->svmm->mutex); > > > else if (!mutex_trylock(&sn->svmm->mutex)) > > > return false; > > > + mmu_range_notifier_update_seq(mrn, seq); > > > mutex_unlock(&sn->svmm->mutex); > > > return true; > > > } > > > > > > > > > At the cost of making the driver a bit more complex, what do you > > > think? > > Hm, spinning this further ... could we initialize the mmu range notifier > > with a pointer to the driver lock, so that we could put a > > lockdep_assert_held into mmu_range_notifier_update_seq? I think that would > > make this scheme substantially more driver-hacker proof :-) > > Going another step further.... what hinders us to put the lock into the mmu > range notifier itself and have _lock()/_unlock() helpers? > > I mean having the lock in the driver only makes sense when the driver would > be using the same lock for multiple things, e.g. multiple MMU range > notifiers under the same lock. But I really don't see that use case here. I actualy do, nouveau use one lock to protect the page table and that's the lock that matter. You can have multiple range for a single page table, idea being only a sub-set of the process address space is ever accessed by the GPU and those it is better to focus on this sub-set and track invalidation in a finer grain. Cheers, Jérôme