On Mon, Feb 24, 2020 at 03:00:56PM -0400, Jason Gunthorpe wrote:
> On Mon, Feb 24, 2020 at 07:23:36PM +0100, Jean-Philippe Brucker wrote:
> > The new allocation scheme introduced by 2c7933f53f6b ("mm/mmu_notifiers:
> > add a get/put scheme for the registration") provides a convenient way
> > for users to attach notifier data to an mm. However, it would be even
> > better to create this notifier data atomically.
> >
> > Since the alloc_notifier() callback only takes an mm argument at the
> > moment, some users have to perform the allocation in two steps.
> > alloc_notifier() initially creates an incomplete structure, which is
> > then finalized using more context once mmu_notifier_get() returns. This
> > second step requires carrying an initialization lock in the notifier
> > data and playing dirty tricks to order memory accesses against live
> > invalidation.
>
> This was the intended pattern. There shouldn't be a real issue as
> there shouldn't be any data on which to invalidate, i.e. the later
> patch does:
>
> +	list_for_each_entry_rcu(bond, &io_mm->devices, mm_head)
>
> And that list is empty post-allocation, so no 'dirty tricks' required.

Before introducing this patch I had the following code:

+	list_for_each_entry_rcu(bond, &io_mm->devices, mm_head) {
+		/*
+		 * To ensure that we observe the initialization of io_mm fields
+		 * by io_mm_finalize() before the registration of this bond to
+		 * the list by io_mm_attach(), introduce an address dependency
+		 * between bond and io_mm. It pairs with the smp_store_release()
+		 * from list_add_rcu().
+		 */
+		io_mm = rcu_dereference(bond->io_mm);
+		io_mm->ops->invalidate(bond->sva.dev, io_mm->pasid, io_mm->ctx,
+				       start, end - start);
+	}

(1) io_mm_get() would obtain an empty io_mm from mmu_notifier_get().
(2) Then io_mm_finalize() would initialize io_mm->ops, io_mm->ctx, etc.
(3) Finally io_mm_attach() would add the bond to io_mm->devices.

Since the above code can start running before (2) completes, once it
obtains a bond added by (3) it needs to observe io_mm->ctx and
io_mm->ops as initialized by (2). I believe this requires the address
dependency from the rcu_dereference() above, or some stronger barrier,
to pair with the list_add_rcu().

If io_mm->ctx and io_mm->ops are already valid before the mmu notifier
is published, then we don't need any of this. That's the main reason I
would have liked moving everything to alloc_notifier(); the locking
below isn't a big deal.

> The other op callback is release, which also cannot be called as the
> caller must hold a mmget to establish the notifier.
>
> So just use the locking that already exists. There is one function
> that calls io_mm_get() which immediately calls io_mm_attach, which
> immediately grabs the global iommu_sva_lock.
>
> Thus init the pasid for the first time under that lock and everything
> is fine.

I agree with this; I can't remember why I used a separate lock for
initialization rather than reusing iommu_sva_lock.

Thanks,
Jean

> There is nothing inherently wrong with the approach in this patch, but
> it seems unneeded in this case..
>
> Jason
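
(For reference, here is a minimal sketch of the ordering pattern I'm
describing above. The names (container, entry, attach(), visit()) are
illustrative stand-ins, not the actual sva code: the writer fully
initializes the container before list_add_rcu(), and the reader
re-derives the container pointer through the list entry so that the
rcu_dereference() address dependency pairs with the release inside
list_add_rcu().)

#include <linux/rculist.h>
#include <linux/rcupdate.h>

struct container {
	int			ctx;		/* written before publication */
	struct list_head	entries;
};

struct entry {
	struct container __rcu	*cont;
	struct list_head	head;
};

/*
 * Writer, i.e. steps (2)+(3): finalize the container, then publish the
 * entry. The smp_store_release() inside list_add_rcu() orders the ctx
 * store before the entry becomes reachable by readers.
 */
static void attach(struct container *c, struct entry *e, int ctx)
{
	c->ctx = ctx;
	rcu_assign_pointer(e->cont, c);
	list_add_rcu(&e->head, &c->entries);
}

/*
 * Reader, i.e. the invalidate path: re-deriving the container through
 * the entry creates an address dependency from the list walk to the
 * ctx load, pairing with the release above. Loading c->ctx directly
 * would not be ordered against list_add_rcu().
 */
static int visit(struct container *c)
{
	struct entry *e;
	int last = 0;

	rcu_read_lock();
	list_for_each_entry_rcu(e, &c->entries, head) {
		struct container *cc = rcu_dereference(e->cont);

		last = cc->ctx;	/* observes the value written in attach() */
	}
	rcu_read_unlock();

	return last;
}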