Intent of this post:
 Seek reviews from Intel reviewers and anyone else on the list who is
 interested in IO performance in confidential VMs. I need some
 Acked-by/Reviewed-by tags before I can add the swiotlb maintainers to
 the To/Cc lists and ask them for a review.

swiotlb is now widely used by confidential VMs. This series optimizes
swiotlb to reduce cache misses and lock contention during bounce buffer
allocation/free and memory bouncing, in order to improve IO workload
performance in confidential VMs.

Here are some FIO tests we ran to demonstrate the improvement.

Test setup
----------

A normal VM with 8 vCPUs and 32G of memory; swiotlb is enabled with
swiotlb=force. In the Host/Guest CPU utilization columns, 100 means one
logical processor. The FIO block size is 4K and iodepth is 256. Note
that a normal VM is used so that others who lack the hardware needed to
host confidential VMs can reproduce the results below.

Results
-------

1 FIO job      read/write   Throughput   IOPS    Host CPU      Guest CPU
                            (MB/s)       (k)     utilization   utilization

vanilla        read         1037         253     228.48        101.92
               write        1148         280     233.28        100.96
optimized      read         1160         283     232.32        101.12
               write        1195         292     233.28        100.64

1-job FIO sequential read/write performance increases by 12% and 4%
respectively.

4 FIO jobs     read/write   Throughput   IOPS    Host CPU      Guest CPU
                            (MB/s)       (k)     utilization   utilization

vanilla        read         885          214.9   527.04        401.12
               write        868          212.1   531.84        400.64
optimized      read         2320         567     344.64        202.8
               write        1998         488     312           173.92

4-job FIO sequential read/write performance increases by 164% and 130%
respectively.

This series is based on 5.19-rc2.

Andi Kleen (1):
  swiotlb: Split up single swiotlb lock

Chao Gao (2):
  swiotlb: Use bitmap to track free slots
  swiotlb: Allocate memory in a cache-friendly way

 .../admin-guide/kernel-parameters.txt         |   4 +-
 arch/x86/kernel/acpi/boot.c                   |   4 +
 include/linux/swiotlb.h                       |  47 +++-
 kernel/dma/swiotlb.c                          | 263 +++++++++++++-----
 4 files changed, 229 insertions(+), 89 deletions(-)

--
2.25.1