On 3/20/2020 9:17 PM, Alex Williamson wrote:
On Fri, 20 Mar 2020 09:40:39 -0600
Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
On Fri, 20 Mar 2020 04:35:29 -0400
Yan Zhao <yan.y.zhao@xxxxxxxxx> wrote:
On Thu, Mar 19, 2020 at 03:41:12AM +0800, Kirti Wankhede wrote:
DMA mapped pages, including those pinned by mdev vendor drivers, might
get unpinned and unmapped while migration is active and the device is still
running. For example, in the pre-copy phase, while the guest driver can still
access those pages, the host device or vendor driver can dirty these mapped
pages. Such pages should be marked dirty so as to maintain memory consistency
for a user making use of dirty page tracking.
To get the bitmap during unmap, the user should set the flag
VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP. The bitmap memory should be allocated
and zeroed by the user space application, which should also set the bitmap
size and page size.
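As a rough illustration of the user-space side described above, the sketch below sizes and zero-allocates a dirty bitmap for a range. The helper names are hypothetical and not part of this patch, and the rule of one bit per page rounded up to whole 64-bit words is an assumption of the sketch, not something this message confirms.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: bytes needed for a dirty
 * bitmap covering range_size bytes at one bit per page. Rounding up to
 * whole 64-bit words is an assumption of this sketch.
 */
static uint64_t sketch_bitmap_bytes(uint64_t range_size, uint64_t pgsize)
{
	/* The page size must be a non-zero power of two. */
	assert(pgsize != 0 && (pgsize & (pgsize - 1)) == 0);

	uint64_t npages = (range_size + pgsize - 1) / pgsize;
	uint64_t nwords = (npages + 63) / 64;

	return nwords * sizeof(uint64_t);
}

/*
 * Hypothetical helper: allocate the bitmap already zeroed, as the commit
 * message asks of user space. The caller frees the result with free().
 */
static uint64_t *sketch_alloc_bitmap(uint64_t range_size, uint64_t pgsize,
				     uint64_t *bytes_out)
{
	*bytes_out = sketch_bitmap_bytes(range_size, pgsize);
	return calloc(1, (size_t)*bytes_out);
}
```

Under these assumptions, a 0x200000-byte range with 0x1000-byte pages covers 512 pages, so the bitmap occupies 64 bytes.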
Signed-off-by: Kirti Wankhede <kwankhede@xxxxxxxxxx>
Reviewed-by: Neo Jia <cjia@xxxxxxxxxx>
---
drivers/vfio/vfio_iommu_type1.c | 55 ++++++++++++++++++++++++++++++++++++++---
include/uapi/linux/vfio.h | 11 +++++++++
2 files changed, 62 insertions(+), 4 deletions(-)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d6417fb02174..aa1ac30f7854 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -939,7 +939,8 @@ static int verify_bitmap_size(uint64_t npages, uint64_t bitmap_size)
}
static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
- struct vfio_iommu_type1_dma_unmap *unmap)
+ struct vfio_iommu_type1_dma_unmap *unmap,
+ struct vfio_bitmap *bitmap)
{
uint64_t mask;
struct vfio_dma *dma, *dma_last = NULL;
@@ -990,6 +991,10 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
* will be returned if these conditions are not met. The v2 interface
* will only return success and a size of zero if there were no
* mappings within the range.
+ *
+ * When VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP flag is set, unmap request
+ * must be for single mapping. Multiple mappings with this flag set is
+ * not supported.
*/
if (iommu->v2) {
dma = vfio_find_dma(iommu, unmap->iova, 1);
@@ -997,6 +1002,13 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
ret = -EINVAL;
goto unlock;
}
+
+ if ((unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
+ (dma->iova != unmap->iova || dma->size != unmap->size)) {
dma is probably NULL here!
Yep, I didn't look closely enough there. This is situated right
between the check to make sure we're not bisecting a mapping at the
start of the unmap and the check to make sure we're not bisecting a
mapping at the end of the unmap. There's no guarantee that we have a
valid pointer here. The test should be in the while() loop below this
code.
Actually the test could remain here, we can exit here if we can't find
a dma at the start of the unmap range with the GET_DIRTY_BITMAP flag,
but we absolutely cannot deref dma without testing it.
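One possible shape for such a guarded test is sketched below as a standalone function. It is only an illustration: the struct and flag names are simplified stand-ins for struct vfio_dma, struct vfio_iommu_type1_dma_unmap and VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP, and returning -EINVAL when no mapping starts at the unmap IOVA is an assumption based on the "exit here" suggestion above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP; the value is arbitrary. */
#define SKETCH_FLAG_GET_DIRTY_BITMAP (1u << 0)

/* Simplified stand-ins for the kernel structures, for illustration only. */
struct sketch_dma {
	uint64_t iova;
	uint64_t size;
};

struct sketch_unmap {
	uint32_t flags;
	uint64_t iova;
	uint64_t size;
};

/*
 * dma is the result of looking up a mapping at unmap->iova and may be NULL.
 * With the bitmap flag set, the request must match exactly one mapping;
 * without the flag, this check imposes nothing.
 */
static int sketch_check_bitmap_unmap(const struct sketch_dma *dma,
				     const struct sketch_unmap *unmap)
{
	if (!(unmap->flags & SKETCH_FLAG_GET_DIRTY_BITMAP))
		return 0;

	/* Test the pointer before dereferencing it. */
	if (!dma)
		return -EINVAL;

	if (dma->iova != unmap->iova || dma->size != unmap->size)
		return -EINVAL;

	return 0;
}
```

The ordering is the point: the NULL test comes before any access to dma->iova or dma->size, and requests without the flag fall through to the existing bisection checks unchanged.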
In the check just above the newly added one, if dma is NULL then it's an error
condition, because unmap requests must fully cover previous mappings, right?
Also, this restriction on UNMAP would make some vIOMMU UNMAP operations
fail.
For example, the following condition does happen in practice:
an UNMAP ioctl arrives for the IOVA range starting at 0xff800000, of size 0x200000.
However, the IOVAs in this range are mapped page by page, i.e., dma->size is 0x1000.
Previously, this UNMAP ioctl could unmap the whole range successfully.
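To put numbers on that example, the small sketch below works out how many page-sized mappings fall inside the request and where the request ends. The helper names are hypothetical and exist only for this arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of equally sized mappings inside one request. */
static uint64_t sketch_mappings_in_request(uint64_t unmap_size,
					   uint64_t map_size)
{
	assert(map_size != 0);
	return unmap_size / map_size;
}

/* Hypothetical helper: first IOVA past the end of the request. */
static uint64_t sketch_request_end(uint64_t iova, uint64_t size)
{
	return iova + size;
}
```

With iova = 0xff800000, an unmap size of 0x200000 and a dma->size of 0x1000, the request spans 512 separate mappings and ends at 0xffa00000, so a test requiring dma->size == unmap->size rejects it whenever the flag is set.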
What triggers this in the guest? Note that it's only when using the
GET_DIRTY_BITMAP flag that this is restricted. Does the event you're
referring to potentially occur under normal circumstances in that mode?
Thanks,
Such an unmap would trigger the vfio_iommu_map_notify() callback in QEMU. In
vfio_iommu_map_notify(), unmap is called on the same range <iova,
iotlb->addr_mask + 1> that was used for map. Secondly, unmap with bitmap
will be called only when the device state has the _SAVING flag set.
Thanks,
Kirti