Hi Alex, > -----Original Message----- > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx] > Sent: Thursday, May 24, 2018 7:21 PM > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx> > Cc: eric.auger@xxxxxxxxxx; pmorel@xxxxxxxxxxxxxxxxxx; > kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; iommu@lists.linux- > foundation.org; Linuxarm <linuxarm@xxxxxxxxxx>; John Garry > <john.garry@xxxxxxxxxx>; xuwei (O) <xuwei5@xxxxxxxxxx>; Joerg Roedel > <joro@xxxxxxxxxx> > Subject: Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list > management > > [Cc +Joerg: AMD-Vi observation towards the end] > > On Wed, 18 Apr 2018 12:40:38 +0100 > Shameer Kolothum <shameerali.kolothum.thodi@xxxxxxxxxx> wrote: > > > This series introduces an iova list associated with a vfio > > iommu. The list is kept updated taking care of iommu apertures, > > and reserved regions. Also this series adds checks for any conflict > > with existing dma mappings whenever a new device group is attached to > > the domain. > > > > User-space can retrieve valid iova ranges using VFIO_IOMMU_GET_INFO > > ioctl capability chains. Any dma map request outside the valid iova > > range will be rejected. > > Hi Shameer, > > I ran into two minor issues in testing this series, both related to > mdev usage of type1. First, in patch 5/7 when we try to validate a dma > map request: I must admit I haven't looked into the mdev use case at all and my impression was that it will be same as others. Thanks for doing these tests. > > +static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu, > > + dma_addr_t start, dma_addr_t end) > > +{ > > + struct list_head *iova = &iommu->iova_list; > > + struct vfio_iova *node; > > + > > + list_for_each_entry(node, iova, list) { > > + if ((start >= node->start) && (end <= node->end)) > > + return true; > > + } > > + > > + return false; > > +} > > A container with only an mdev device will have an empty list because it > has not backing iommu to set ranges or reserved regions, so any dma map > will fail. I think this is resolved as follows: > > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -1100,7 +1100,7 @@ static bool vfio_iommu_iova_dma_valid(struct > vfio_iommu *iommu, > return true; > } > > - return false; > + return list_empty(&iommu->iova_list); > } Ok. > ie. return false only if there was anything to test against. > > The second issue is similar, patch 6/7 adds to VFIO_IOMMU_GET_INFO: > > + ret = vfio_iommu_iova_build_caps(iommu, &caps); > + if (ret) > + return ret; > > And build_caps has: > > + list_for_each_entry(iova, &iommu->iova_list, list) > + iovas++; > + > + if (!iovas) { > + ret = -EINVAL; > > Therefore if the iova list is empty, as for mdevs, the use can no > longer even call VFIO_IOMMU_GET_INFO on the container, which is a > regression. Again, I think the fix is simple: > > @@ -2090,7 +2090,7 @@ static int vfio_iommu_iova_build_caps(struct > vfio_iommu *iommu, > iovas++; > > if (!iovas) { > - ret = -EINVAL; > + ret = 0; > goto out_unlock; > } > > ie. build_caps needs to handle lack of an iova_list as a non-error. Ok. > Also, I wrote a small unit test to validate the iova list for my > systems[1]. With the above changes, my Intel test system gives expected > results: > > # ./vfio-type1-iova-list /sys/bus/mdev/devices/c08db5ed-05d3-4b39-b150- > 438a18bc698f /sys/bus/pci/devices/0000:00:1b.0 > ---- Adding device: c08db5ed-05d3-4b39-b150-438a18bc698f ---- > Initial info struct size: 0x18 > No caps > ---- Adding device: 0000:00:1b.0 ---- > Initial info struct size: 0x18 > Requested info struct size: 0x48 > New info struct size: 0x48 > argsz: 0x48, flags: 0x3, cap_offset: 0x18 > 00: 4800 0000 0300 0000 00f0 ffff ffff ffff > 10: 1800 0000 0000 0000 0100 0100 0000 0000 > 20: 0200 0000 0000 0000 0000 0000 0000 0000 > 30: ffff dffe 0000 0000 0000 f0fe 0000 0000 > 40: ffff ffff ffff ff01 > [cap id: 1, version: 1, next: 0x0] > Found type1 iova range version: 1 > 00: 0000000000000000 - 00000000fedfffff > 01: 00000000fef00000 - 01ffffffffffffff > > Adding an mdev device to the container results in no iova list, adding > the physical device updates to the expected set with the MSI range > excluded. > > I was a little surprised by an AMD system: > > # ./vfio-type1-iova-list /sys/bus/pci/devices/0000:01:00.0 > ---- Adding device: 0000:01:00.0 ---- > Initial info struct size: 0x18 > Requested info struct size: 0x58 > New info struct size: 0x58 > argsz: 0x58, flags: 0x3, cap_offset: 0x18 > 00: 5800 0000 0300 0000 00f0 ffff 7fff ffff > 10: 1800 0000 0000 0000 0100 0100 0000 0000 > 20: 0300 0000 0000 0000 0000 0000 0000 0000 > 30: ffff dffe 0000 0000 0000 f0fe 0000 0000 > 40: ffff ffff fc00 0000 0000 0000 0001 0000 > 50: ffff ffff ffff ffff > [cap id: 1, version: 1, next: 0x0] > Found type1 iova range version: 1 > 00: 0000000000000000 - 00000000fedfffff > 01: 00000000fef00000 - 000000fcffffffff > 02: 0000010000000000 - ffffffffffffffff > > The additional excluded range is the HyperTransport area (which for > 99.9+% of use cases isn't going to be a problem) is a rather surprising > limit for the size of a VM we can use on AMD, just under 1TB. I fully > expect we'll see a bug report from that and we should be thinking about > how to address it. Otherwise, if the changes above look good to you > I'll incorporate them into their respective patches and push to my next > branch. Thanks, Yes, the above changes related to list empty cases looks fine to me. Much appreciated, Shameer > Alex > > [1] https://gist.github.com/awilliam/25e39252ab28101a2d89ace1ff765fdd