From: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>

Traditionally swiotlb was not performance critical because it was only
used for slow devices. But in some setups, like TDX/SEV confidential
guests, all IO has to go through swiotlb. Currently swiotlb only has a
single lock. Under high IO load with multiple CPUs this can lead to
significant lock contention on the swiotlb lock.

This patch series adds child IO TLB mem support to reduce spinlock
contention among a device's queues. Each device may allocate its own
IO TLB mem and set up child IO TLB mems according to its number of
queues. The number of child IO TLB mems may be set equal to the
device's queue count, which helps to reduce swiotlb spinlock
contention among devices and queues.

Patch 2 introduces the IO TLB block concept and the
swiotlb_device_allocate() API to allocate a per-device swiotlb bounce
buffer. The new API accepts the queue number as the number of child
IO TLB mems to set up for the device's IO TLB mem.

Tianyu Lan (2):
  swiotlb: Add Child IO TLB mem support
  Swiotlb: Add device bounce buffer allocation interface

 include/linux/swiotlb.h |  40 ++++++
 kernel/dma/swiotlb.c    | 290 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 317 insertions(+), 13 deletions(-)

-- 
2.25.1
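
For illustration only, the idea of splitting one bounce-buffer pool into
per-queue children, each with its own lock, can be sketched in user-space C.
This is not the code from the patches: struct child_pool, struct device_pool,
device_pool_init() and pool_alloc() are hypothetical names, the lock is a
plain C11 atomic_flag spinlock standing in for the kernel spinlock, and slot
accounting is reduced to a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_CHILDREN 64

/* Hypothetical stand-in for one child IO TLB mem area. */
struct child_pool {
	atomic_flag lock;	/* protects only this child's slots */
	size_t nslabs;		/* slots owned by this child */
	size_t used;		/* slots currently handed out */
};

/* Hypothetical stand-in for a per-device IO TLB mem with children. */
struct device_pool {
	unsigned int nchildren;
	struct child_pool child[MAX_CHILDREN];
};

/* Split total_slabs evenly across one child per device queue. */
static int device_pool_init(struct device_pool *p, unsigned int nqueues,
			    size_t total_slabs)
{
	unsigned int i;

	if (nqueues == 0 || nqueues > MAX_CHILDREN || total_slabs < nqueues)
		return -1;

	p->nchildren = nqueues;
	for (i = 0; i < nqueues; i++) {
		atomic_flag_clear(&p->child[i].lock);
		p->child[i].nslabs = total_slabs / nqueues;
		p->child[i].used = 0;
	}
	return 0;
}

/*
 * Reserve n slots from the child that belongs to the given queue.
 * Only that child's lock is taken, so other queues are not blocked.
 */
static int pool_alloc(struct device_pool *p, unsigned int queue, size_t n)
{
	struct child_pool *c = &p->child[queue % p->nchildren];
	int ret = -1;

	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;	/* spin until this child's lock is free */

	if (c->nslabs - c->used >= n) {
		c->used += n;
		ret = 0;
	}

	atomic_flag_clear_explicit(&c->lock, memory_order_release);
	return ret;
}
```

With four queues and 1024 slots, each child in this sketch owns 256 slots, and
an allocation on queue 1 never touches the lock used by queue 2. The even
split and the modulo mapping from queue to child are assumptions made for the
sketch; the patches may size and select children differently.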