On Tue, Feb 25, 2020 at 10:24:39AM +0100, Jean-Philippe Brucker wrote: > On Mon, Feb 24, 2020 at 03:00:56PM -0400, Jason Gunthorpe wrote: > > On Mon, Feb 24, 2020 at 07:23:36PM +0100, Jean-Philippe Brucker wrote: > > > The new allocation scheme introduced by 2c7933f53f6b ("mm/mmu_notifiers: > > > add a get/put scheme for the registration") provides a convenient way > > > for users to attach notifier data to an mm. However, it would be even > > > better to create this notifier data atomically. > > > > > > Since the alloc_notifier() callback only takes an mm argument at the > > > moment, some users have to perform the allocation in two times. > > > alloc_notifier() initially creates an incomplete structure, which is > > > then finalized using more context once mmu_notifier_get() returns. This > > > second step requires carrying an initialization lock in the notifier > > > data and playing dirty tricks to order memory accesses against live > > > invalidation. > > > > This was the intended pattern. Tthere shouldn't be an real issue as > > there shouldn't be any data on which to invalidate, ie the later patch > > does: > > > > + list_for_each_entry_rcu(bond, &io_mm->devices, mm_head) > > > > And that list is empty post-allocation, so no 'dirty tricks' required. > > Before introducing this patch I had the following code: > > + list_for_each_entry_rcu(bond, &io_mm->devices, mm_head) { > + /* > + * To ensure that we observe the initialization of io_mm fields > + * by io_mm_finalize() before the registration of this bond to > + * the list by io_mm_attach(), introduce an address dependency > + * between bond and io_mm. It pairs with the smp_store_release() > + * from list_add_rcu(). > + */ > + io_mm = rcu_dereference(bond->io_mm); A rcu_dereference isn't need here, just a normal derference is fine. > + io_mm->ops->invalidate(bond->sva.dev, io_mm->pasid, io_mm->ctx, > + start, end - start); > + } > > (1) io_mm_get() would obtain an empty io_mm from iommu_notifier_get(). > (2) then io_mm_finalize() would initialize io_mm->ops, io_mm->ctx, etc. > (3) finally io_mm_attach() would add the bond to io_mm->devices. > > Since the above code can run before (2) it needs to observe valid > io_mm->ctx, io_mm->ops initialized by (2) after obtaining the bond > initialized by (3). Which I believe requires the address dependency from > the rcu_dereference() above or some stronger barrier to pair with the > list_add_rcu(). The list_for_each_entry_rcu() is an acquire that already pairs with the release in list_add_rcu(), all you need is a data dependency chain starting on bond to be correct on ordering. But this is super tricky :\ > If io_mm->ctx and io_mm->ops are already valid before the > mmu notifier is published, then we don't need that stuff. So, this trickyness with RCU is not a bad reason to introduce the priv scheme, maybe explain it in the commit message? Jason