Hi Christoph,

On Wed, 17 May 2023 08:56:53 +0200
Christoph Hellwig <hch@xxxxxx> wrote:

> Just thinking out loud:
>
>  - what if we always way overallocate the swiotlb buffer
>  - and then mark the second half / two thirds / <pull some number out
>    of the thin air> slots as used, and make that region available
>    through a special CMA mechanism as ZONE_MOVABLE (but not allowing
>    other CMA allocations to dip into it).

This approach has also been considered internally at Huawei, and it
looked like a viable option, just more complex. We decided to send the
simple approach first to get some feedback and find out who else might
be interested in the dynamic sizing of swiotlb (if anyone).

> This allows us to have a single slot management for the entire
> area, but allow reclaiming from it.  We'd probably also need to make
> this CMA variant irq safe.

Let me recap my internal analysis.

On the pro side:

- no performance penalty for devices that do not use swiotlb
- all alignment and boundary constraints can be met
- efficient use of memory for buffers smaller than 1 page

On the con side:

- ZONE_MOVABLE cannot be used for most kernel allocations
- competition with CMA over precious physical address space
  (How much should be reserved for CMA and how much for SWIOTLB?)

To quote from the Memory hotplug documentation:

  Usually, MOVABLE:KERNEL ratios of up to 3:1 or even 4:1 are fine.
  [...]
  Actual safe zone ratios depend on the workload. Extreme cases, like
  excessive long-term pinning of pages, might not be able to deal with
  ZONE_MOVABLE at all.

This should be no big issue on bare metal (where the motivation is
addressing limitations), but the size of SWIOTLB in CoCo VMs probably
needs some consideration.

> This could still be combined with more aggressive use of per-device
> swiotlb area, which is probably a good idea based on some hints.
> E.g. device could hint an amount of inflight DMA to the DMA layer,
> and if there are addressing limitations and the amount is large enough
> that could cause the allocation of a per-device swiotlb area.

I would not rely on device hints, because it probably depends on
workload rather than type of device. I'd rather implement some logic
based on the actual runtime usage pattern. I have some ideas already.

Petr T