On Mon, 23 Nov 2020 02:37:32 +0000
Justin He <Justin.He@xxxxxxx> wrote:

> Hi Alex, thanks for the comments.
> See mine below:
>
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Sent: Friday, November 20, 2020 1:05 AM
> > To: Justin He <Justin.He@xxxxxxx>
> > Cc: Cornelia Huck <cohuck@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH] vfio iommu type1: Bypass the vma permission check in
> > vfio_pin_pages_remote()
> >
> > On Thu, 19 Nov 2020 22:27:37 +0800
> > Jia He <justin.he@xxxxxxx> wrote:
> >
> > > The permission of vfio iommu is different and incompatible with vma
> > > permission. If the iotlb->perm is IOMMU_NONE (e.g. qemu side), qemu will
> > > simply call unmap ioctl() instead of mapping. Hence vfio_dma_map() can't
> > > map a dma region with NONE permission.
> > >
> > > This corner case will be exposed in coming virtio_fs cache_size
> > > commit [1]
> > >  - mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > >    memory_region_init_ram_ptr()
> > >  - re-mmap the above area with read/write authority.
> > >  - vfio_dma_map() will be invoked when vfio device is hotplug added.
> > >
> > > qemu:
> > > vfio_listener_region_add()
> > >     vfio_dma_map(..., readonly=false)
> > >         map.flags is set to VFIO_DMA_MAP_FLAG_READ|VFIO_..._WRITE
> > >         ioctl(VFIO_IOMMU_MAP_DMA)
> > >
> > > kernel:
> > > vfio_dma_do_map()
> > >     vfio_pin_map_dma()
> > >         vfio_pin_pages_remote()
> > >             vaddr_get_pfn()
> > >                 ...
> > >                 check_vma_flags() failed! because
> > >                 vm_flags hasn't VM_WRITE && gup_flags
> > >                 has FOLL_WRITE
> > >
> > > It will report error in qemu log when hotplug adding(vfio) a nvme disk
> > > to qemu guest on an Ampere EMAG server:
> > > "VFIO_MAP_DMA failed: Bad address"
> >
> > I don't fully understand the argument here, I think this is suggesting
> > that because QEMU won't call VFIO_IOMMU_MAP_DMA on a region that has
> > NONE permission, the kernel can ignore read/write permission by using
> > FOLL_FORCE. Not only is QEMU not the only userspace driver for vfio,
> > but regardless of that, we can't trust the behavior of any given
> > userspace driver. Bypassing the permission check with FOLL_FORCE seems
> > like it's placing the trust in the user, which seems like a security
> > issue. Thanks,
>
> Yes, this might have side impact on security.
> But besides this simple fix(adding FOLL_FORCE), do you think it is a good
> idea that:
> Qemu provides a special vfio_dma_map_none_perm() to allow mapping a
> region with NONE permission?

If NONE permission implies that we use FOLL_FORCE as described here to
ignore the r+w permissions and trust that the user knows what they're
doing, that seems like a non-starter. Ultimately I think what you're
describing is a scenario where our current permission check fails and
the solution is probably to extend the check to account for other ways
that a user may have access to a vma rather than bypass the check.

Thanks,
Alex
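
For anyone following along, the failure mode under discussion can be
sketched in a few lines of standalone C. This is only a simplified model
of the read/write branch of the check, not the kernel's code: the
MODEL_* names and their bit values are made up for illustration, and the
real check_vma_flags() applies further conditions (including additional
restrictions even when FOLL_FORCE is set) that are left out here.

```c
#include <errno.h>

/* Illustrative flag bits only; these are NOT the kernel's values. */
#define MODEL_VM_READ     0x1u
#define MODEL_VM_WRITE    0x2u
#define MODEL_FOLL_WRITE  0x1u
#define MODEL_FOLL_FORCE  0x2u

/*
 * Simplified model of the vma permission check (hypothetical helper):
 * a write pin on a vma without write permission, or a read pin on a
 * vma without read permission, fails with -EFAULT unless the caller
 * passed the "force" flag.
 */
static int model_check_vma_flags(unsigned int vm_flags,
				 unsigned int gup_flags)
{
	if (gup_flags & MODEL_FOLL_WRITE) {
		/* Write access requested. */
		if (!(vm_flags & MODEL_VM_WRITE) &&
		    !(gup_flags & MODEL_FOLL_FORCE))
			return -EFAULT;
	} else if (!(vm_flags & MODEL_VM_READ)) {
		/* Read access requested on an unreadable vma. */
		if (!(gup_flags & MODEL_FOLL_FORCE))
			return -EFAULT;
	}
	return 0;
}
```

With a PROT_NONE vma (vm_flags == 0) and a write request, the model
returns -EFAULT, which userspace reports as "Bad address", matching the
QEMU log above. Adding the force flag makes the same call succeed
regardless of the vma's permissions, which is exactly why it amounts to
trusting the caller.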