On 2/19/21 3:32 PM, Konrad Rzeszutek Wilk wrote:
> On Sun, Feb 07, 2021 at 04:56:01PM +0100, Christoph Hellwig wrote:
>> On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote:
>>> So one thing that has been on my mind for a while: I'd really like
>>> to kill the separate dma ops in Xen swiotlb. If we compare xen-swiotlb
>>> to swiotlb the main difference seems to be:
>>>
>>> - additional reasons to bounce I/O vs the plain DMA capable
>>> - the possibility to do a hypercall on arm/arm64
>>> - an extra translation layer before doing the phys_to_dma and vice
>>>   versa
>>> - a special memory allocator
>>>
>>> I wonder if in between a few jump labels or other no overhead enablement
>>> options and possibly better use of the dma_range_map we could kill
>>> off most of swiotlb-xen instead of maintaining all this code duplication?
>> So I looked at this a bit more.
>>
>> For x86 with XENFEAT_auto_translated_physmap (how common is that?)
> Juergen, Boris please correct me if I am wrong, but that XENFEAT_auto_translated_physmap
> only works for PVH guests?

That's both HVM and PVH (for dom0 it's only PVH).

-boris

>
>> pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is.
>>
>> xen_arch_need_swiotlb always returns true for x86, and
>> range_straddles_page_boundary should never be true for the
>> XENFEAT_auto_translated_physmap case.
> Correct. The kernel should have no clue of what the real MFNs are
> for PFNs.
>> So as far as I can tell the mapping fast path for the
>> XENFEAT_auto_translated_physmap can be trivially reused from swiotlb.
>>
>> That leaves us with the next more complicated case, x86 or fully cache
>> coherent arm{,64} without XENFEAT_auto_translated_physmap. In that case
>> we need to patch in a phys_to_dma/dma_to_phys that performs the MFN
>> lookup, which could be done using alternatives or jump labels.
>> I think if that is done right we should also be able to let that cover
>> the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but
>> in that worst case that would need another alternative / jump label.
>>
>> For non-coherent arm{,64} we'd also need to use alternatives or jump
>> labels for the cache maintenance ops, but that isn't a hard problem
>> either.
>>
>>
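
As a rough illustration of the "patch in a phys_to_dma that performs the MFN lookup" idea quoted above, here is a minimal userspace sketch. It is not kernel code and not taken from any posted patch. Every `sketch_*` name, the page shift, and the table contents are made up for the example, and a plain bool stands in for what would be a jump label or alternative in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)
#define SKETCH_NR_PAGES   4

/* Hypothetical PFN -> MFN table; the values are arbitrary. */
static const uint64_t sketch_p2m[SKETCH_NR_PAGES] = { 0x80, 0x81, 0x13, 0x07 };

/*
 * Stand-in for a jump label / alternative (assumption: in the kernel this
 * would be patched once at boot rather than tested as a variable).
 * false models the XENFEAT_auto_translated_physmap case.
 */
bool sketch_needs_mfn_lookup;

/* Identity (a nop) when auto-translated, table lookup otherwise. */
uint64_t sketch_pfn_to_gfn(uint64_t pfn)
{
	if (!sketch_needs_mfn_lookup)
		return pfn;
	assert(pfn < SKETCH_NR_PAGES);
	return sketch_p2m[pfn];
}

/* Translate the frame number, keep the offset within the page. */
uint64_t sketch_phys_to_dma(uint64_t phys)
{
	uint64_t gfn = sketch_pfn_to_gfn(phys >> SKETCH_PAGE_SHIFT);

	return (gfn << SKETCH_PAGE_SHIFT) | (phys & SKETCH_PAGE_MASK);
}

/*
 * True if the buffer crosses a page boundary onto a frame that is not
 * contiguous after translation. With the identity mapping this can
 * never be true.
 */
bool sketch_range_straddles(uint64_t phys, uint64_t size)
{
	uint64_t first, last, pfn;

	if (size == 0)
		return false;

	first = phys >> SKETCH_PAGE_SHIFT;
	last = (phys + size - 1) >> SKETCH_PAGE_SHIFT;

	for (pfn = first; pfn < last; pfn++)
		if (sketch_pfn_to_gfn(pfn + 1) != sketch_pfn_to_gfn(pfn) + 1)
			return true;
	return false;
}
```

With the flag clear, `sketch_phys_to_dma()` returns its argument unchanged and the straddle check never fires, which mirrors the auto-translated case above. With the flag set, the same callers go through the table, so only the translation helper would need patching.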