Hi Jean-Philippe,

A couple of queries below -

Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx> writes:

> Assigning devices using VFIO allows the guest to have direct access to the
> device, whilst filtering accesses to sensitive areas by trapping config
> space accesses and mapping DMA with an IOMMU.
>
> This patch adds a new option to lkvm run: --vfio-group=<group_number>.
> Before assigning a device to a VM, some preparation is required. As
> described in Linux Documentation/vfio.txt, the device driver need to be

Nitpick: "needs"

> changed to vfio-pci:
>
> $ dev=0000:00:00.0
>
> $ echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
> $ echo vfio-pci > /sys/bus/pci/devices/$dev/driver_override
> $ echo $dev > /sys/bus/pci/drivers_probe
> $ readlink /sys/bus/pci/devices/$dev/iommu_group
> ../../../kernel/iommu_groups/5
>
> Adding --vfio[-group]=5 to lkvm-run will pass the device to the guest.
> Multiple groups can be passed to the guest by adding more --vfio
> parameters.
>
> This patch only implements PCI with INTx. MSI-X routing will be added in a
> subsequent patch, and at some point we might add support for passing
> platform devices to guests.
>
> Signed-off-by: Will Deacon <will.deacon@xxxxxxx>
> Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx>
> ---
>  Makefile                 |   2 +
>  arm/pci.c                |   1 +
>  builtin-run.c            |   5 +
>  include/kvm/kvm-config.h |   3 +
>  include/kvm/pci.h        |   3 +-
>  include/kvm/vfio.h       |  57 +++++++
>  vfio/core.c              | 395 +++++++++++++++++++++++++++++++++++++++++++++++
>  vfio/pci.c               | 365 +++++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 830 insertions(+), 1 deletion(-)
>  create mode 100644 include/kvm/vfio.h
>  create mode 100644 vfio/core.c
>  create mode 100644 vfio/pci.c
> [...]
> +static int vfio_container_init(struct kvm *kvm)
> +{
> +        int api, i, ret, iommu_type;;
> +
> +        /* Create a container for our IOMMU groups */
> +        vfio_container = open(VFIO_DEV_NODE, O_RDWR);
> +        if (vfio_container == -1) {
> +                ret = errno;
> +                pr_err("Failed to open %s", VFIO_DEV_NODE);
> +                return ret;
> +        }
> +
> +        api = ioctl(vfio_container, VFIO_GET_API_VERSION);
> +        if (api != VFIO_API_VERSION) {
> +                pr_err("Unknown VFIO API version %d", api);
> +                return -ENODEV;
> +        }

We are using the VFIO_API_VERSION pulled in from linux/vfio.h. Will
kvmtool be in trouble if, for some reason, the API version is
incompatibly incremented in the future?

> +
> +        iommu_type = vfio_get_iommu_type();
> +        if (iommu_type < 0) {
> +                pr_err("VFIO type-1 IOMMU not supported on this platform");
> +                return iommu_type;
> +        }
> +
> +        /* Sanity check our groups and add them to the container */
> +        for (i = 0; i < kvm->cfg.num_vfio_groups; ++i) {
> +                ret = vfio_group_init(kvm, &kvm->cfg.vfio_group[i]);
> +                if (ret)
> +                        return ret;
> +        }
> +
> +        /* Finalise the container */
> +        if (ioctl(vfio_container, VFIO_SET_IOMMU, iommu_type)) {
> +                ret = -errno;
> +                pr_err("Failed to set IOMMU type %d for VFIO container",
> +                       iommu_type);
> +                return ret;

Just checking - is there a need to remove the groups from the container
(added by VFIO_GROUP_SET_CONTAINER in vfio_group_init()) before erroring
out here? I imagine freeing the vfio_container when kvmtool cleans up
after itself will do the right thing.
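If explicit unwinding does turn out to be necessary, something along these
lines is roughly what I had in mind - a completely untested sketch, and I'm
guessing that the per-group state in include/kvm/vfio.h carries the group's
fd and id, so the vfio_groups_unwind() helper and the fd/id field names
below are made up:

#include <sys/ioctl.h>
#include <linux/vfio.h>

#include "kvm/kvm.h"
#include "kvm/util.h"
#include "kvm/vfio.h"

/*
 * Untested sketch: detach every group that vfio_group_init() has already
 * attached to the container, so the VFIO_SET_IOMMU failure path in
 * vfio_container_init() could call this before returning. The fd and id
 * members of struct vfio_group are guesses on my part.
 */
static void vfio_groups_unwind(struct kvm *kvm, int nr_attached)
{
        int i;

        for (i = 0; i < nr_attached; ++i) {
                struct vfio_group *group = &kvm->cfg.vfio_group[i];

                /* VFIO_GROUP_UNSET_CONTAINER takes no argument */
                if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER))
                        pr_err("Failed to unset container for group %lu",
                               group->id);
        }
}

But if closing the container fd is guaranteed to release the groups anyway,
please ignore this.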
Thanks,
Punit

> +        } else {
> +                pr_info("Using IOMMU type %d for VFIO container", iommu_type);
> +        }
> +
> +        return kvm__for_each_mem_bank(kvm, KVM_MEM_TYPE_RAM, vfio_map_mem_bank,
> +                                      NULL);
> +}
> +
> +static int vfio__init(struct kvm *kvm)
> +{
> +        int ret;
> +
> +        if (!kvm->cfg.num_vfio_groups)
> +                return 0;
> +
> +        ret = vfio_container_init(kvm);
> +        if (ret)
> +                return ret;
> +
> +        ret = vfio_configure_iommu_groups(kvm);
> +        if (ret)
> +                return ret;
> +
> +        return 0;
> +}
> +dev_base_init(vfio__init);
> +
> +static int vfio__exit(struct kvm *kvm)
> +{
> +        int i;
> +
> +        if (!kvm->cfg.num_vfio_groups)
> +                return 0;
> +
> +        for (i = 0; i < kvm->cfg.num_vfio_groups; ++i)
> +                vfio_group_exit(kvm, &kvm->cfg.vfio_group[i]);
> +
> +        kvm__for_each_mem_bank(kvm, KVM_MEM_TYPE_RAM, vfio_unmap_mem_bank, NULL);
> +        return close(vfio_container);
> +}
> +dev_base_exit(vfio__exit);

[...]