So one thing that has been on my mind for a while: I'd really like to
kill the separate dma ops in Xen swiotlb. If we compare xen-swiotlb to
swiotlb, the main differences seem to be:

 - additional reasons to bounce I/O beyond the plain DMA capable check
 - the possibility to do a hypercall on arm/arm64
 - an extra translation layer before doing the phys_to_dma and vice
   versa
 - a special memory allocator

I wonder if, between a few jump labels or other no-overhead enablement
options and possibly better use of the dma_range_map, we could kill off
most of swiotlb-xen instead of maintaining all this code duplication?