On 19/12/2019 10:28, Leonardo Bras wrote:
> On Wed, 2019-12-18 at 15:53 +1100, Alexey Kardashevskiy wrote:
>> H_STUFF_TCE is always called with 0. Well, may be some AIX somewhere
>> calls it with a value other than zero, and I probably saw some other
>> value somewhere but in QEMU/KVM case it is 0 so you effectively disable
>> in-kernel acceleration of H_STUFF_TCE which is
>> undesirable.
>>
>
> Thanks for the feedback!
>
>> For now we should disable in-kernel H_STUFF_TCE/... handlers in QEMU
>> just like we do for VFIO for older host kernels:
>>
>> https://git.qemu.org/?p=qemu.git;a=blob;f=hw/ppc/spapr_iommu.c;h=3d3bcc86496a5277d62f7855fbb09c013c015f27;hb=HEAD#l208
>
> I am still reading into this temporary solution, I could still not
> understand how it works.
>
>> I am not sure what a proper solution would be, something like an eventfd
>> and KVM's kvmppc_h_stuff_tce() signaling vhost that the latter needs to
>> invalidate iotlbs. Or we can just say that we do not allow KVM
>> acceleration if there is vhost+iommu on the same liobn (== vPHB, pretty
>> much). Thanks,
>
> I am not used to eventfd, but i agree it's a valid solution to talk to
> QEMU and then use it to send a message via /dev/vhost.
> KVM -> QEMU -> vhost
>
> But I can't get my mind out of another solution: doing it in
> kernelspace. I am not sure how that would work, though.
>
> If I could understand correctly, there is a vhost IOTLB per vhost_dev,
> and H_STUFF_TCE is not called in 64-bit DMA case (for tce_value == 0
> case, at least), which makes sense, given it doesn't need to invalidate
> entries on IOTLB.
>
> So, we would need to somehow replace `return H_TOO_HARD` in this patch
> with code that could call vhost_process_iotlb_msg() with
> VHOST_IOTLB_INVALIDATE.
>
> For that, I would need to know what are the vhost_dev's of that
> process, which I don't know if it's possible to do currently (or safe
> at all).
>
> I am thinking of linking all vhost_dev's with a list (list.h) that
> could be searched, comparing `mm_struct *` of the calling task with all
> vhost_dev's, and removing the entry of all IOTLB that hits.
>
> Not sure if that's the best approach to find the related vhost_dev's.
>
> What do you think?

As discussed in Slack, we need to do the same thing we do with physical
devices when we invalidate hardware IOMMU translation caches via
tbl->it_ops->tce_kill. The problem to solve now is how to tell KVM/PPC
about the vhost IOTLB (is there an fd?), with something similar to the
existing KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE.

I guess x86 handles all the mappings in QEMU and therefore they do not
have this problem. Thanks,

-- 
Alexey
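
For reference, the list-based lookup proposed in the quoted text can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: `fake_vhost_dev`, `iotlb_entry`, `dev_register()`, `iotlb_add()` and `invalidate_for_owner()` are hypothetical names, the `owner` pointer stands in for the `mm_struct *` of the calling task, and a fixed array stands in for the per-device IOTLB. A real implementation would also need locking (or RCU) around the list, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_IOTLB 8

/* Hypothetical stand-in for one cached IOTLB translation. */
struct iotlb_entry {
    uint64_t iova;
    uint64_t size;
    int valid;
};

/* Hypothetical stand-in for struct vhost_dev.
 * `owner` plays the role of the mm_struct * of the owning process. */
struct fake_vhost_dev {
    const void *owner;
    struct iotlb_entry iotlb[MAX_IOTLB];
    struct fake_vhost_dev *next;
};

/* Global list of all registered devices (no locking in this sketch). */
static struct fake_vhost_dev *dev_list;

/* Put a device on the global list with an empty IOTLB. */
static void dev_register(struct fake_vhost_dev *dev, const void *owner)
{
    size_t i;

    dev->owner = owner;
    for (i = 0; i < MAX_IOTLB; i++)
        dev->iotlb[i].valid = 0;
    dev->next = dev_list;
    dev_list = dev;
}

/* Cache a translation; returns 0 on success, -1 if the IOTLB is full. */
static int iotlb_add(struct fake_vhost_dev *dev, uint64_t iova, uint64_t size)
{
    size_t i;

    for (i = 0; i < MAX_IOTLB; i++) {
        if (!dev->iotlb[i].valid) {
            dev->iotlb[i].iova = iova;
            dev->iotlb[i].size = size;
            dev->iotlb[i].valid = 1;
            return 0;
        }
    }
    return -1;
}

/* Walk every device owned by `owner` and drop each cached entry that
 * overlaps [iova, iova + size). Returns the number of entries dropped. */
static int invalidate_for_owner(const void *owner, uint64_t iova, uint64_t size)
{
    struct fake_vhost_dev *dev;
    int dropped = 0;
    size_t i;

    for (dev = dev_list; dev; dev = dev->next) {
        if (dev->owner != owner)
            continue;
        for (i = 0; i < MAX_IOTLB; i++) {
            struct iotlb_entry *e = &dev->iotlb[i];

            if (e->valid && e->iova < iova + size && iova < e->iova + e->size) {
                e->valid = 0;
                dropped++;
            }
        }
    }
    return dropped;
}
```

In this sketch the hypercall handler would call something like `invalidate_for_owner()` where the patch currently has `return H_TOO_HARD`. Devices owned by other processes are skipped, so only the caller's cached translations are affected.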