On 2022-04-28 15:55, Andi Kleen wrote:
> On 4/28/2022 7:45 AM, Christoph Hellwig wrote:
>> On Thu, Apr 28, 2022 at 03:44:36PM +0100, Robin Murphy wrote:
>>> Rather than introduce this extra level of allocator complexity, how about
>>> just dividing up the initial SWIOTLB allocation into multiple io_tlb_mem
>>> instances?
>> Yeah. We're almost done removing all knowledge of swiotlb from drivers,
>> so the very last thing I want is an interface that allows a driver to
>> allocate a per-device buffer.
> At least for TDX we need parallelism with a single device for performance.
> So if you split up the io tlb mems for a device then you would need a
> new mechanism to load balance the requests for a single device over those.
> I doubt it would be any simpler.
Eh, I think it would be, since the round-robin retry loop can then just
sit around the existing io_tlb_mem-based allocator, vs. the churn of
inserting it in the middle, plus it's then really easy to statically
distribute different starting points across different devices via
dev->dma_io_tlb_mem if we wanted to.
Admittedly the overall patch probably ends up about the same size, since
it likely pushes a bit more complexity into swiotlb_init to compensate,
but that's still a trade-off I like.
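A minimal userspace sketch of that shape, not kernel code: the per-pool
allocator stays as it is and a round-robin retry loop sits around it, with
each device given a static starting pool. Every name here (toy_pool, toy_dev,
pool_alloc, dev_alloc, NR_POOLS, SLOTS_PER_POOL) is hypothetical and only
stands in for io_tlb_mem, dev->dma_io_tlb_mem and the existing slot
allocator. Locking is left out; presumably each real instance would keep its
own lock, which is where the parallelism would come from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_POOLS       4  /* hypothetical: number of io_tlb_mem-like instances */
#define SLOTS_PER_POOL 8  /* hypothetical: slots in each instance */

/* Toy stand-in for one io_tlb_mem instance (not the kernel structure). */
struct toy_pool {
	bool used[SLOTS_PER_POOL];
};

/* Toy stand-in for a device with a statically assigned starting pool. */
struct toy_dev {
	unsigned int start_pool;
};

static struct toy_pool pools[NR_POOLS];

/*
 * The "existing" allocator, untouched by the retry logic: claim nslots
 * contiguous free slots in a single pool. Returns the first slot index,
 * or -1 if this pool cannot satisfy the request.
 */
static int pool_alloc(struct toy_pool *p, unsigned int nslots)
{
	unsigned int i, j, run = 0;

	if (nslots == 0 || nslots > SLOTS_PER_POOL)
		return -1;

	for (i = 0; i < SLOTS_PER_POOL; i++) {
		run = p->used[i] ? 0 : run + 1;
		if (run == nslots) {
			unsigned int first = i + 1 - nslots;

			for (j = first; j <= i; j++)
				p->used[j] = true;
			return (int)first;
		}
	}
	return -1;
}

/*
 * Round-robin retry loop wrapped around pool_alloc(): start at the
 * device's own pool and fall through to the others in order. Returns a
 * global slot index (pool * SLOTS_PER_POOL + slot), or -1 if every pool
 * is too full.
 */
static int dev_alloc(const struct toy_dev *dev, unsigned int nslots)
{
	unsigned int n;

	assert(dev->start_pool < NR_POOLS);

	for (n = 0; n < NR_POOLS; n++) {
		unsigned int idx = (dev->start_pool + n) % NR_POOLS;
		int slot = pool_alloc(&pools[idx], nslots);

		if (slot >= 0)
			return (int)(idx * SLOTS_PER_POOL) + slot;
	}
	return -1;
}
```

Two devices with different start_pool values only contend once their own
pools fill up, and a single device still spills across all pools under load.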
Thanks,
Robin.