On Tue, Jan 15, 2019 at 09:37:42AM +0100, Joerg Roedel wrote:
> On Mon, Jan 14, 2019 at 01:20:45PM -0500, Michael S. Tsirkin wrote:
> > Which would be fine especially if we can manage not to introduce a bunch
> > of indirect calls all over the place and hurt performance.
>
> Which indirect calls? In case of unset dma_ops the DMA-API functions
> call directly into the dma-direct implementation, no indirect calls at
> all.

True.  But the NULL-ops dma direct case still has two issues that might
not work for virtio:

 (a) if the architecture supports devices that are not DMA coherent it
     will dip into a special allocator and do cache maintenance for
     streaming mappings.  Although it would be a bit of a hack we could
     work around this in virtio by doing something like:

#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
	dev->dma_coherent = true;
#endif

     except that won't work for mips, which has a mode where it uses a
     system-level instead of a device-level coherency flag and thus
     doesn't look at this struct device field

 (b) we can still mangle the DMA address, either using the
     dma_pfn_offset field in struct device, or by a full override
     of __phys_to_dma / __dma_to_phys by the architecture code.
     The first could be solved using a hack like the one above,
     but the latter would be a little harder.  In the long run
     I'd love to get rid of that hook and have everyone use the
     generic offset code, but for that we first need to support
     multiple ranges with different offsets and do quite some
     nasty arch code surgery.

So for the legacy virtio case I fear we need to keep a local dma mapping
implementation for now.  I just wish no recent hypervisor would ever
offer devices in this broken legacy mode..

> > Regards,
> >
> > Joerg
---end quoted text---
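
To make the address mangling in point (b) concrete, here is a rough userspace model of the generic offset translation.  It is only a sketch: every model_* name, the fixed 4K page shift and the example offset are assumptions made up for illustration, not the kernel's definitions, and an architecture that fully overrides __phys_to_dma / __dma_to_phys can do something entirely different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
#define MODEL_PAGE_SHIFT 12	/* assumes 4K pages */

typedef uint64_t model_phys_addr_t;
typedef uint64_t model_dma_addr_t;

struct model_device {
	/* offset, in pages, between the CPU and the bus view of memory */
	unsigned long dma_pfn_offset;
};

/* CPU physical address -> address the device has to be given */
static inline model_dma_addr_t
model_phys_to_dma(const struct model_device *dev, model_phys_addr_t paddr)
{
	return paddr -
		((model_dma_addr_t)dev->dma_pfn_offset << MODEL_PAGE_SHIFT);
}

/* address used by the device -> CPU physical address */
static inline model_phys_addr_t
model_dma_to_phys(const struct model_device *dev, model_dma_addr_t daddr)
{
	return daddr +
		((model_phys_addr_t)dev->dma_pfn_offset << MODEL_PAGE_SHIFT);
}

/*
 * Legacy virtio expects the guest physical address to reach the device
 * unchanged, so any non-identity translation is a problem for it.
 */
static inline int
model_addr_is_mangled(const struct model_device *dev, model_phys_addr_t paddr)
{
	return model_phys_to_dma(dev, paddr) != paddr;
}

/* The two directions must undo each other for any offset. */
static inline void
model_check_round_trip(const struct model_device *dev, model_phys_addr_t paddr)
{
	assert(model_dma_to_phys(dev, model_phys_to_dma(dev, paddr)) == paddr);
}
```

With dma_pfn_offset left at zero the translation is the identity, which is what the legacy virtio case needs.  Any non-zero offset changes the address that ends up in the ring, and that is the mangling a local mapping implementation avoids.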