On Fri, 23 Aug 2024 20:41:15 +0000 Michael Kelley <mhklinux@xxxxxxxxxxx> wrote: > From: Petr Tesařík <petr@xxxxxxxxxxx> Sent: Friday, August 23, 2024 12:41 AM > > > > On Thu, 22 Aug 2024 11:37:12 -0700 > > mhkelley58@xxxxxxxxx wrote: >[...] > > > @@ -71,12 +72,15 @@ > > > * from each index. > > > * @pad_slots: Number of preceding padding slots. Valid only in the first > > > * allocated non-padding slot. > > > + * @throttled: Boolean indicating the slot is used by a request that was > > > + * throttled. Valid only in the first allocated non-padding slot. > > > */ > > > struct io_tlb_slot { > > > phys_addr_t orig_addr; > > > size_t alloc_size; > > > unsigned short list; > > > - unsigned short pad_slots; > > > + u8 pad_slots; > > > + u8 throttled; > > > > I'm not sure this flag is needed for each slot. > > > > SWIOTLB mappings should be throttled when the total SWIOTLB usage is > > above a threshold. Conversely, it can be unthrottled when the total > > usage goes below a threshold, and it should not matter if that happens > > due to an unmap of the exact buffer which previously pushed the usage > > over the edge, or due to an unmap of any other unrelated buffer. > > I think I understand what you are proposing. But I don't see a way > to make it work without adding global synchronization beyond > the current atomic counter for the number of uI'm sed slabs. At a minimum > we would need a global spin lock instead of the atomic counter. The spin > lock would protect the (non-atomic) slab count along with some other > accounting, and that's more global references. As described in the > cover letter, I was trying to avoid doing that. I have thought about this for a few days. And I'm still not convinced. You have made it clear in multiple places that the threshold is a soft limit, and there are many ways the SWIOTLB utilization may exceed the threshold. In fact I'm not even 100% sure that an atomic counter is needed, because the check is racy anyway. Another task may increase (or decrease) the counter between atomic_long_read(&mem->total_used) and a subsequent down(&mem->throttle_sem). I consider it a feature, not a flaw, because the real important checks happen later while searching for free slots, and those are protected with a spinlock. > If you can see how to do what you propose with just the current > atomic counter, please describe. I think I'm certainly missing something obvious, but let me open the discussion to improve my understanding of the matter. Suppose we don't protect the slab count with anything. What is the worst possible outcome? IIUC the worst scenario is that multiple tasks unmap swiotlb buffers simultaneously and all of them believe that their action made the total usage go below the low threshold, so all of them try to release the semaphore. That's obviously not good, but AFAICS all that's needed is a test_and_clear_bit() on a per-io_tlb_mem throttled flag just before calling up(). Since up() would acquire the semaphore's spinlock, and there's only one semaphore per io_tlb_mem, adding an atomic flag doesn't look like too much overhead to me, especially if it ends up in the same cache line as the semaphore. Besides, this all happens only in the slow path, i.e. only after the current utilization has just dropped below the low threshold, not if the utilization was already below the threshold before freeing up some slots. I have briefly considered subtracting the low threshold as initial bias from mem->total_used and using atomic_long_add_negative() to avoid the need for an extra throttled flag, but at this point I'm not sure it can be implemented without any races. We can try to figure out the details if it sounds like a good idea. Petr T