Re: [PATCH v2 kvmtool 06/10] Add PCI device passthrough using VFIO

Punit Agrawal <punit.agrawal@xxxxxxx> · Thu, 03 Aug 2017 10:36:41 +0100

Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx> writes:

> Hi Punit,
>
> Thanks for reviewing and testing
>
> On 31/07/17 18:52, Punit Agrawal wrote:
>> Hi Jean-Philippe,
>> 
>> A couple of queries below -
>> 
>> Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx> writes:
>> 
>>> Assigning devices using VFIO allows the guest to have direct access to the
>>> device, whilst filtering accesses to sensitive areas by trapping config
>>> space accesses and mapping DMA with an IOMMU.
>>>
>>> This patch adds a new option to lkvm run: --vfio-group=<group_number>.
>>> Before assigning a device to a VM, some preparation is required. As
>>> described in Linux Documentation/vfio.txt, the device driver need to be
>> 
>> Nitpick: "needs"
>> 
>>> changed to vfio-pci:
>>>
>>>   $ dev=0000:00:00.0
>>>
>>>   $ echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
>>>   $ echo vfio-pci > /sys/bus/pci/devices/$dev/driver_override
>>>   $ echo $dev > /sys/bus/pci/drivers_probe
>>>   $ readlink /sys/bus/pci/devices/$dev/iommu_group
>>>   ../../../kernel/iommu_groups/5
>>>
>>> Adding --vfio[-group]=5 to lkvm-run will pass the device to the guest.
>>> Multiple groups can be passed to the guest by adding more --vfio
>>> parameters.
>>>
>>> This patch only implements PCI with INTx. MSI-X routing will be added in a
>>> subsequent patch, and at some point we might add support for passing
>>> platform devices to guests.
>>>
>>> Signed-off-by: Will Deacon <will.deacon@xxxxxxx>
>>> Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
>>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx>
>>> ---
>>>  Makefile                 |   2 +
>>>  arm/pci.c                |   1 +
>>>  builtin-run.c            |   5 +
>>>  include/kvm/kvm-config.h |   3 +
>>>  include/kvm/pci.h        |   3 +-
>>>  include/kvm/vfio.h       |  57 +++++++
>>>  vfio/core.c              | 395 +++++++++++++++++++++++++++++++++++++++++++++++
>>>  vfio/pci.c               | 365 +++++++++++++++++++++++++++++++++++++++++++
>>>  8 files changed, 830 insertions(+), 1 deletion(-)
>>>  create mode 100644 include/kvm/vfio.h
>>>  create mode 100644 vfio/core.c
>>>  create mode 100644 vfio/pci.c
>>>
>> 
>> [...]
>> 
>>> +static int vfio_container_init(struct kvm *kvm)
>>> +{
>>> +	int api, i, ret, iommu_type;;
>>> +
>>> +	/* Create a container for our IOMMU groups */
>>> +	vfio_container = open(VFIO_DEV_NODE, O_RDWR);
>>> +	if (vfio_container == -1) {
>>> +		ret = errno;
>>> +		pr_err("Failed to open %s", VFIO_DEV_NODE);
>>> +		return ret;
>>> +	}
>>> +
>>> +	api = ioctl(vfio_container, VFIO_GET_API_VERSION);
>>> +	if (api != VFIO_API_VERSION) {
>>> +		pr_err("Unknown VFIO API version %d", api);
>>> +		return -ENODEV;
>>> +	}
>> 
>> We are using the VFIO_API_VERSION pulled in from linux/vfio.h. Will
>> kvmtool be in trouble if for some reason the API version is incompatibly
>> incremented in the future?
>
> There are no precedents for this, as VFIO version is still 0. Additions to
> the API are added in a backward-compatible way (for example by adding new
> flags in the info structures and new fields at the end). If something
> significant enough happened and required to increase the version number,
> then the next version would probably be incompatible, and kvmtool would
> have to support it separately.

In that case, we should use a local version number macro for comparison
and if ever VFIO api changes kvmtool will explicitly complain instead of
silently dealing with an unknown version.

>
>>> +
>>> +	iommu_type = vfio_get_iommu_type();
>>> +	if (iommu_type < 0) {
>>> +		pr_err("VFIO type-1 IOMMU not supported on this platform");
>>> +		return iommu_type;
>>> +	}
>>> +
>>> +	/* Sanity check our groups and add them to the container */
>>> +	for (i = 0; i < kvm->cfg.num_vfio_groups; ++i) {
>>> +		ret = vfio_group_init(kvm, &kvm->cfg.vfio_group[i]);
>>> +		if (ret)
>>> +			return ret;
>>> +	}
>>> +
>>> +	/* Finalise the container */
>>> +	if (ioctl(vfio_container, VFIO_SET_IOMMU, iommu_type)) {
>>> +		ret = -errno;
>>> +		pr_err("Failed to set IOMMU type %d for VFIO container",
>>> +		       iommu_type);
>>> +		return ret;
>> 
>> Just checking - is there a need to remove the groups from the containers
>> (added by VFIO_GROUP_SET_CONTAINER in vfio_group_init()) before erroring
>> out here? I imagine freeing of the vfio_container when kvmtool is
>> cleaning up after itself will do the right thing.
>
> Yes, VFIO cleans everything up when the container and group file
> descriptors are released.

Thanks for confirming!