(restoring the Cc list that I accidentally removed in my previous reply.)

On Mon, 15 May 2023 10:39:11 +0100
Catalin Marinas <catalin.marinas@xxxxxxx> wrote:

> Hi Petr,
>
> On Mon, May 15, 2023 at 10:47:37AM +0200, Petr Tesařík wrote:
> > On Sun, 14 May 2023 19:54:27 +0100
> > Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > > On Tue, May 09, 2023 at 11:18:19AM +0200, Petr Tesarik wrote:
> > > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > > > index daa2064f2ede..8cbb0bebb0bc 100644
> > > > --- a/include/linux/swiotlb.h
> > > > +++ b/include/linux/swiotlb.h
> > > > @@ -152,7 +152,11 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
> > > >
> > > >  	return mem &&
> > > >  		(is_swiotlb_fixed(mem, paddr) ||
> > > > -		 (mem->allow_dyn && is_swiotlb_dyn(dev, paddr)));
> > > > +		 /* Pairs with smp_store_release() in swiotlb_dyn_map()
> > > > +		  * and swiotlb_dyn_unmap().
> > > > +		  */
> > > > +		 (smp_load_acquire(&dev->dma_io_tlb_have_dyn) &&
> > > > +		  is_swiotlb_dyn(dev, paddr)));
> > > >  }
> > > >
> > > >  static inline bool is_swiotlb_force_bounce(struct device *dev)
> > > > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > > > index 81eab1c72c50..e8be3ee50f18 100644
> > > > --- a/kernel/dma/swiotlb.c
> > > > +++ b/kernel/dma/swiotlb.c
> > > > @@ -642,6 +642,9 @@ static phys_addr_t swiotlb_dyn_map(struct device *dev, phys_addr_t orig_addr,
> > > >
> > > >  	spin_lock_irqsave(&dev->dma_io_tlb_dyn_lock, flags);
> > > >  	list_add(&slot->node, &dev->dma_io_tlb_dyn_slots);
> > > > +	if (!dev->dma_io_tlb_have_dyn)
> > > > +		/* Pairs with smp_load_acquire() in is_swiotlb_buffer() */
> > > > +		smp_store_release(&dev->dma_io_tlb_have_dyn, true);
> > > >  	spin_unlock_irqrestore(&dev->dma_io_tlb_dyn_lock, flags);
> > >
> > > I'm not sure this works. What this seems to do is that if the caller of
> > > is_swiotlb_buffer() sees the flag set, it's guaranteed that something
> > > was added to the dma_io_tlb_dyn_slots list.
> > > But the reverse is not
> > > necessarily true. IOW, if something was added to the list, there is a
> > > brief window where the dma_io_tlb_have_dyn flag is still false. In the
> > > general case, I doubt any ordering between list_add() and the flag
> > > setting changes anything since neither of them may be visible to another
> > > CPU.
> >
> > Thank you for the review! This patch probably needs a bit more
> > explanation.
> >
> > The goal is to avoid taking a spin lock in the most common case, i.e.
> > when the dynamic list is empty. The only required invariant is:
> >
> >   When the flag is clear, it is safe to skip the list.
> >
> > It's not a bug to walk an empty list, it's merely less efficient. Such
> > a race window would be acceptable. OTOH that's not your concern if I
> > understand you correctly.
>
> I got what the patch aims to do. What I meant is that your proposed
> invariant (flag == false => list_empty()) doesn't hold. The list becomes
> non-empty with list_add(), but the flag is set after. For this to work,
> you'd need to set the flag prior to list_add().
>
> However, even if you fix this invariant, I don't think it helps, because
> the ordering isn't between a list update and the flag but rather between
> a list update and the actual paddr visibility you want to look up in the
> list.
>
> > > What you need is for a 'paddr' added to the dynamic list to be correctly
> > > identified by another CPU as dynamic swiotlb. That other CPU doesn't
> > > check random addresses but only those returned by the DMA API.
> >
> > Yes, that's correct.
> >
> > > Such
> > > values would be normally passed through a memory location (e.g. driver
> > > local structures) and that's what you want to order against.
> >
> > This would certainly work, but I'm not sure I need it. My only goal is
> > that when the flag is set, the new value is observed by all CPUs on the
> > next call to is_swiotlb_buffer().
>
> Which value? paddr? That's not guaranteed with your current ordering.
I was thinking the flag value, but...

> Let's say P0 does:
>
> 	list_add(paddr);
> 	smp_store_release(&flag, true);
> 	WRITE_ONCE(blah, paddr);	// using *_ONCE for clarity
>
> On P1:
>
> 	paddr = READ_ONCE(blah);
> 	if (smp_load_acquire(&flag)) {
> 		// check the list etc.
> 	}

... I see now. The problem is that smp_store_release() on P0 may happen
after smp_load_acquire() on P1, while paddr was already visible to P1
(not through the list). Thank you for your patience; I was too focused
on the linked list.

> [...]
> > Wait, let me check that I understand you right. Do you suggest that I
> > convert dma_io_tlb_dyn_slots to a lockless list and get rid of the
> > spinlock?
>
> I don't mind the spinlock. It feels safer to keep it ;) (and you do need
> it for list updates anyway). Using RCU gets too complicated TBH, I'd
> rather keep it simple; it's not on an extremely critical path.
>
> What I believe is that you can get rid of the flag and just rely on
> probing list_empty() locklessly. If that's false, you take the lock and
> search the list again. Basically, dev->dma_io_tlb_dyn_slots.next is your
> flag; no need for an additional one (just the explicit smp_*mb()
> barriers as I mentioned above).

Since we don't control how drivers store the physical address (to give
them release semantics), the check cannot be performed without a read
memory barrier (aka "dsb" on arm64). Okay... still better than the
spinlock.

Petr T
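
[Editor's note: for readers following along, here is a minimal user-space
sketch of the flag-free scheme the thread converges on. It uses C11
atomics and a pthread mutex in place of the kernel's smp_rmb() and
spinlock, and all names (dyn_head, dyn_map, is_dyn, struct slot) are
hypothetical stand-ins for the swiotlb structures discussed above, not
the actual kernel code. The list head itself acts as the "flag": it is
probed locklessly, and only a non-NULL probe falls through to the locked
search.]

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct slot {
	unsigned long paddr;
	struct slot *next;
};

/* Stand-in for dev->dma_io_tlb_dyn_slots.next. */
static struct slot *_Atomic dyn_head;
/* Stand-in for dev->dma_io_tlb_dyn_lock. */
static pthread_mutex_t dyn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Producer: add a slot under the lock; publish the new head with
 * release semantics so a lockless probe sees a fully linked node. */
static void dyn_map(struct slot *s)
{
	pthread_mutex_lock(&dyn_lock);
	s->next = atomic_load_explicit(&dyn_head, memory_order_relaxed);
	atomic_store_explicit(&dyn_head, s, memory_order_release);
	pthread_mutex_unlock(&dyn_lock);
}

/* Consumer: probe the head locklessly; if the list looks empty, skip
 * it. Otherwise issue an acquire fence (the smp_rmb() analogue from
 * the discussion) and do the real search under the lock. */
static bool is_dyn(unsigned long paddr)
{
	bool found = false;
	struct slot *s;

	if (!atomic_load_explicit(&dyn_head, memory_order_relaxed))
		return false;		/* fast path: no dynamic slots */
	atomic_thread_fence(memory_order_acquire);

	pthread_mutex_lock(&dyn_lock);
	for (s = atomic_load_explicit(&dyn_head, memory_order_relaxed);
	     s && !found; s = s->next)
		found = (s->paddr == paddr);
	pthread_mutex_unlock(&dyn_lock);
	return found;
}
```

A spuriously non-empty probe is harmless here, matching the point made
earlier in the thread: the worst case is taking the lock and searching
a list that has since become empty, which is a performance cost, not a
correctness bug.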