RE: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi interrupt

> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> Sent: Tuesday, October 06, 2015 4:15 AM
> To: Bhushan Bharat-R65777 <Bharat.Bhushan@xxxxxxxxxxxxx>
> Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx;
> christoffer.dall@xxxxxxxxxx; eric.auger@xxxxxxxxxx; pranavkumar@xxxxxxxxxx;
> marc.zyngier@xxxxxxx; will.deacon@xxxxxxx
> Subject: Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi
> interrupt
> 
> On Mon, 2015-10-05 at 07:20 +0000, Bhushan Bharat wrote:
> >
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > > Sent: Saturday, October 03, 2015 4:17 AM
> > > To: Bhushan Bharat-R65777 <Bharat.Bhushan@xxxxxxxxxxxxx>
> > > Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx;
> > > christoffer.dall@xxxxxxxxxx; eric.auger@xxxxxxxxxx;
> > > pranavkumar@xxxxxxxxxx; marc.zyngier@xxxxxxx; will.deacon@xxxxxxx
> > > Subject: Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi
> > > interrupt
> > >
> > > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > > An MSI-address is allocated and programmed in pcie device during
> > > > interrupt configuration. Now for a pass-through device, try to
> > > > create the iommu mapping for this allocated/programmed msi-address.
> > > > If the iommu mapping is created and the msi address programmed in
> > > > the pcie device is different from msi-iova as per iommu
> > > > programming then reconfigure the pci device to use msi-iova as msi
> > > > address.
> > > >
> > > > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@xxxxxxxxxxxxx>
> > > > ---
> > > >  drivers/vfio/pci/vfio_pci_intrs.c | 36 ++++++++++++++++++++++++++++++++++--
> > > >  1 file changed, 34 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> > > > index 1f577b4..c9690af 100644
> > > > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > > > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> > > > @@ -312,13 +312,23 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > > >  	int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
> > > >  	char *name = msix ? "vfio-msix" : "vfio-msi";
> > > >  	struct eventfd_ctx *trigger;
> > > > +	struct msi_msg msg;
> > > > +	struct vfio_device *device;
> > > > +	uint64_t msi_addr, msi_iova;
> > > >  	int ret;
> > > >
> > > >  	if (vector >= vdev->num_ctx)
> > > >  		return -EINVAL;
> > > >
> > > > +	device = vfio_device_get_from_dev(&pdev->dev);
> > >
> > > Have you looked at this function?  I don't think we want to be doing
> > > that every time we want to poke the interrupt configuration.
> >
> > Let me describe what I understood: a device can have many interrupts,
> > and we should set up the iommu mapping only once, when called for the
> > first time to enable/set up an interrupt.
> > Similarly, when disabling interrupts we should iommu-unmap when
> > called for the last enabled interrupt for that device. With this
> > understanding, should I move this map/unmap into separate functions
> > and call them from vfio_msi_set_block() rather than from
> > vfio_msi_set_vector_signal()?
> 
> Interrupts can be setup and torn down at any time and I don't see how one
> function or the other makes much difference.
> vfio_device_get_from_dev() is enough overhead that the data we need
> should be cached if we're going to call it with some regularity.  Maybe
> vfio_iommu_driver_ops.open() should be called with a pointer to the
> vfio_device... or the vfio_group.

vfio_iommu_driver_ops.open()? Or do you mean that vfio_pci_open() should be
called with the vfio_device or vfio_group, and that we cache it in
vfio_pci_device?
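
A minimal user-space sketch of the caching idea, assuming the group (or
device) pointer is looked up once at open time and reused on every interrupt
setup. All names here (struct group, struct pci_dev_state,
expensive_group_lookup(), dev_open(), dev_group_for_irq_setup()) are
hypothetical stand-ins, not the VFIO API; expensive_group_lookup() only models
a costly call such as vfio_device_get_from_dev().

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's vfio_group. */
struct group {
	int id;
};

/* Hypothetical stand-in for vfio_pci_device: holds the cached pointer. */
struct pci_dev_state {
	struct group *cached_group;	/* filled once at open time */
};

/* Counts how often the expensive lookup runs (illustration only). */
static int lookup_count;
static struct group the_group = { .id = 7 };

/* Models an expensive lookup such as vfio_device_get_from_dev(). */
static struct group *expensive_group_lookup(void)
{
	lookup_count++;
	return &the_group;
}

/* Open path: do the lookup once and cache the result. */
static int dev_open(struct pci_dev_state *s)
{
	s->cached_group = expensive_group_lookup();
	return s->cached_group ? 0 : -1;
}

/* Interrupt setup path: reuse the cached pointer, no lookup. */
static struct group *dev_group_for_irq_setup(const struct pci_dev_state *s)
{
	return s->cached_group;
}
```

With this shape, calling dev_group_for_irq_setup() for every vector leaves
lookup_count at 1 after a single dev_open().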

> 
> > >  Also note that
> > > IOMMU mappings don't operate on devices, but groups, so maybe we
> > > want to pass the group.
> >
> > Yes, it operates on the group. I hesitated to add an API to get the
> > group. Do you suggest that it is OK to add an API to get the group from
> > the device?
> 
> No, the above suggestion is probably better.
> 
> > >
> > > > +	if (device == NULL)
> > > > +		return -EINVAL;
> > >
> > > This would be a legitimate BUG_ON(!device)
> > >
> > > > +
> > > >  	if (vdev->ctx[vector].trigger) {
> > > >  		free_irq(irq, vdev->ctx[vector].trigger);
> > > > +		get_cached_msi_msg(irq, &msg);
> > > > +		msi_iova = ((u64)msg.address_hi << 32) | msg.address_lo;
> > > > +		vfio_device_unmap_msi(device, msi_iova, PAGE_SIZE);
> > > >  		kfree(vdev->ctx[vector].name);
> > > >  		eventfd_ctx_put(vdev->ctx[vector].trigger);
> > > >  		vdev->ctx[vector].trigger = NULL;
> > > > @@ -346,12 +356,11 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > > >  	 * cached value of the message prior to enabling.
> > > >  	 */
> > > >  	if (msix) {
> > > > -		struct msi_msg msg;
> > > > -
> > > >  		get_cached_msi_msg(irq, &msg);
> > > >  		pci_write_msi_msg(irq, &msg);
> > > >  	}
> > > >
> > > > +
> > >
> > > gratuitous newline
> > >
> > > >  	ret = request_irq(irq, vfio_msihandler, 0,
> > > >  			  vdev->ctx[vector].name, trigger);
> > > >  	if (ret) {
> > > > @@ -360,6 +369,29 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > > >  		return ret;
> > > >  	}
> > > >
> > > > +	/* Re-program the new-iova in pci-device in case there is
> > > > +	 * different iommu-mapping created for programmed msi-address.
> > > > +	 */
> > > > +	get_cached_msi_msg(irq, &msg);
> > > > +	msi_iova = 0;
> > > > +	msi_addr = (u64)(msg.address_hi) << 32 | (u64)(msg.address_lo);
> > > > +	ret = vfio_device_map_msi(device, msi_addr, PAGE_SIZE, &msi_iova);
> > > > +	if (ret) {
> > > > +		free_irq(irq, vdev->ctx[vector].trigger);
> > > > +		kfree(vdev->ctx[vector].name);
> > > > +		eventfd_ctx_put(trigger);
> > > > +		return ret;
> > > > +	}
> > > > +
> > > > +	/* Reprogram only if iommu-mapped iova is different from msi-address */
> > > > +	if (msi_iova && (msi_iova != msi_addr)) {
> > > > +		msg.address_hi = (u32)(msi_iova >> 32);
> > > > +		/* Keep Lower bits from original msi message address */
> > > > +		msg.address_lo &= PAGE_MASK;
> > > > +		msg.address_lo |= (u32)(msi_iova & 0x00000000ffffffff);
> > >
> > > Seems like you're making some assumptions here that are dependent on
> > > the architecture and maybe the platform.
> >
> > What I tried is to map the msi page at a different iova, which is
> > page-size aligned, while the offset within the page remains the same.
> > For example, the original msi address was 0x0603_0040 and we have a
> > reserved region at 0xf000_0000. So the iommu mapping is created for
> > 0xf000_0000 => 0x0600_3000 of size 0x1000.
> >
> > So the new address to be programmed in the device is 0xf000_0040:
> > offset 0x40 added to the base address in the iommu mapping.
> 
> Don't you need ~PAGE_MASK for it to work like that?  The & with
> 0x00000000ffffffff shouldn't be needed either, certainly not with all the
> leading zeros.

Yes, I think ~PAGE_MASK can be used.
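
A minimal user-space sketch of the address composition with ~PAGE_MASK,
assuming 4 KiB pages. PAGE_SHIFT, PAGE_SIZE and PAGE_MASK are redefined here
to mirror the kernel's convention (PAGE_MASK clears the in-page offset bits);
msi_doorbell_iova(), split_msi_addr() and struct msi_addr_pair are
hypothetical helpers for illustration, not part of the patch.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, mirroring the kernel's PAGE_MASK convention. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(UINT64_C(1) << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))

/* Hypothetical holder for the two 32-bit halves of an MSI address. */
struct msi_addr_pair {
	uint32_t hi;
	uint32_t lo;
};

/*
 * Combine the page-aligned iova chosen by the iommu mapping with the
 * in-page offset of the original msi address. The offset is selected
 * with ~PAGE_MASK; the page bits come from the iova.
 */
static uint64_t msi_doorbell_iova(uint64_t msi_addr, uint64_t iova_base)
{
	return (iova_base & PAGE_MASK) | (msi_addr & ~PAGE_MASK);
}

/* Split a 64-bit address; the cast truncates, so no explicit mask is needed. */
static struct msi_addr_pair split_msi_addr(uint64_t addr)
{
	struct msi_addr_pair p;

	p.hi = (uint32_t)(addr >> 32);
	p.lo = (uint32_t)addr;
	return p;
}
```

With the example values above, msi_doorbell_iova(0x06030040, 0xf0000000)
gives 0xf0000040, which splits into hi = 0x0 and lo = 0xf0000040.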

Thanks
-Bharat

> 
> > > > +		pci_write_msi_msg(irq, &msg);
> > > > +	}
> > > > +
> > > >  	vdev->ctx[vector].trigger = trigger;
> > > >
> > > >  	return 0;
> > >
> > >
> >
> 
> 
