On Mon, 2015-10-05 at 07:20 +0000, Bhushan Bharat wrote:
> 
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > Sent: Saturday, October 03, 2015 4:17 AM
> > To: Bhushan Bharat-R65777 <Bharat.Bhushan@xxxxxxxxxxxxx>
> > Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx;
> > christoffer.dall@xxxxxxxxxx; eric.auger@xxxxxxxxxx; pranavkumar@xxxxxxxxxx;
> > marc.zyngier@xxxxxxx; will.deacon@xxxxxxx
> > Subject: Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi
> > interrupt
> > 
> > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > An MSI-address is allocated and programmed in pcie device during
> > > interrupt configuration. Now for a pass-through device, try to create
> > > the iommu mapping for this allocated/programmed msi-address. If the
> > > iommu mapping is created and the msi address programmed in the pcie
> > > device is different from msi-iova as per iommu programming then
> > > reconfigure the pci device to use msi-iova as msi address.
> > >
> > > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@xxxxxxxxxxxxx>
> > > ---
> > >  drivers/vfio/pci/vfio_pci_intrs.c | 36 ++++++++++++++++++++++++++++++++++--
> > >  1 file changed, 34 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> > > index 1f577b4..c9690af 100644
> > > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> > > @@ -312,13 +312,23 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > >  	int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
> > >  	char *name = msix ? "vfio-msix" : "vfio-msi";
> > >  	struct eventfd_ctx *trigger;
> > > +	struct msi_msg msg;
> > > +	struct vfio_device *device;
> > > +	uint64_t msi_addr, msi_iova;
> > >  	int ret;
> > >
> > >  	if (vector >= vdev->num_ctx)
> > >  		return -EINVAL;
> > >
> > > +	device = vfio_device_get_from_dev(&pdev->dev);
> > 
> > Have you looked at this function?  I don't think we want to be doing that
> > every time we want to poke the interrupt configuration.
> 
> I am trying to describe what I understood: a device can have many
> interrupts and we should set up the iommu mapping only once, when called
> the first time to enable/setup an interrupt.  Similarly, when disabling
> interrupts we should iommu-unmap when called for the last enabled
> interrupt of that device.  With this understanding, should I move this
> map/unmap to separate functions and call them from vfio_msi_set_block()
> rather than from vfio_msi_set_vector_signal()?

Interrupts can be set up and torn down at any time and I don't see how
one function or the other makes much difference.
vfio_device_get_from_dev() is enough overhead that the data we need
should be cached if we're going to call it with some regularity.  Maybe
vfio_iommu_driver_ops.open() should be called with a pointer to the
vfio_device... or the vfio_group.

> > Also note that IOMMU mappings don't operate on devices, but groups,
> > so maybe we want to pass the group.
> 
> Yes, it operates on a group.  I hesitated to add an API to get the
> group.  Do you suggest that it is ok to add an API to get the group
> from a device?

No, the above suggestion is probably better.
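Either way the point is to avoid doing that lookup per vector.  A rough
sketch of the caching variant — the vfio_dev field and both helpers
below are made-up names, not existing code — would resolve the
vfio_device once when MSI/MSI-X is enabled and drop it on disable:

	/*
	 * Sketch only: cache the vfio_device lookup once per MSI/MSI-X
	 * enable instead of calling vfio_device_get_from_dev() for every
	 * vector.  vfio_pci_device has no vfio_dev member today; it is
	 * hypothetical.
	 */
	static int vfio_pci_msi_cache_device(struct vfio_pci_device *vdev)
	{
		if (vdev->vfio_dev)		/* already cached */
			return 0;

		vdev->vfio_dev = vfio_device_get_from_dev(&vdev->pdev->dev);

		return vdev->vfio_dev ? 0 : -ENODEV;
	}

	static void vfio_pci_msi_drop_device(struct vfio_pci_device *vdev)
	{
		if (!vdev->vfio_dev)
			return;

		vfio_device_put(vdev->vfio_dev);
		vdev->vfio_dev = NULL;
	}

vfio_msi_enable()/vfio_msi_disable() would call these, and
vfio_msi_set_vector_signal() would then only dereference the cached
pointer (or the group hanging off it) when it needs to map or unmap
the MSI page.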
> > > +	if (device == NULL)
> > > +		return -EINVAL;
> > 
> > This would be a legitimate BUG_ON(!device)
> > 
> > > +
> > >  	if (vdev->ctx[vector].trigger) {
> > >  		free_irq(irq, vdev->ctx[vector].trigger);
> > > +		get_cached_msi_msg(irq, &msg);
> > > +		msi_iova = ((u64)msg.address_hi << 32) | msg.address_lo;
> > > +		vfio_device_unmap_msi(device, msi_iova, PAGE_SIZE);
> > >  		kfree(vdev->ctx[vector].name);
> > >  		eventfd_ctx_put(vdev->ctx[vector].trigger);
> > >  		vdev->ctx[vector].trigger = NULL;
> > > @@ -346,12 +356,11 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > >  	 * cached value of the message prior to enabling.
> > >  	 */
> > >  	if (msix) {
> > > -		struct msi_msg msg;
> > > -
> > >  		get_cached_msi_msg(irq, &msg);
> > >  		pci_write_msi_msg(irq, &msg);
> > >  	}
> > >
> > > +
> > 
> > gratuitous newline
> > 
> > >  	ret = request_irq(irq, vfio_msihandler, 0,
> > >  			  vdev->ctx[vector].name, trigger);
> > >  	if (ret) {
> > > @@ -360,6 +369,29 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > >  		return ret;
> > >  	}
> > >
> > > +	/* Re-program the new-iova in pci-device in case there is
> > > +	 * different iommu-mapping created for programmed msi-address.
> > > +	 */
> > > +	get_cached_msi_msg(irq, &msg);
> > > +	msi_iova = 0;
> > > +	msi_addr = (u64)(msg.address_hi) << 32 | (u64)(msg.address_lo);
> > > +	ret = vfio_device_map_msi(device, msi_addr, PAGE_SIZE, &msi_iova);
> > > +	if (ret) {
> > > +		free_irq(irq, vdev->ctx[vector].trigger);
> > > +		kfree(vdev->ctx[vector].name);
> > > +		eventfd_ctx_put(trigger);
> > > +		return ret;
> > > +	}
> > > +
> > > +	/* Reprogram only if iommu-mapped iova is different from msi-address */
> > > +	if (msi_iova && (msi_iova != msi_addr)) {
> > > +		msg.address_hi = (u32)(msi_iova >> 32);
> > > +		/* Keep Lower bits from original msi message address */
> > > +		msg.address_lo &= PAGE_MASK;
> > > +		msg.address_lo |= (u32)(msi_iova & 0x00000000ffffffff);
> > 
> > Seems like you're making some assumptions here that are dependent on
> > the architecture and maybe the platform.
> 
> What I tried is to map the msi page with a different iova, which is
> page-size aligned, but the offset within the page will remain the same.
> For example, the original msi address was 0x0603_0040 and we have a
> reserved region at 0xf000_0000, so an iommu mapping is created for
> 0xf000_0000 => 0x0603_0000 of size 0x1000.
> 
> So the new address to be programmed in the device is 0xf000_0040, i.e.
> offset 0x40 added to the base address of the iommu mapping.

Don't you need ~PAGE_MASK for it to work like that?  The & with
0x00000000ffffffff shouldn't be needed either, certainly not with all
the leading zeros.  (A sketch of that masking follows after the quoted
patch below.)

> > > +		pci_write_msi_msg(irq, &msg);
> > > +	}
> > > +
> > >  	vdev->ctx[vector].trigger = trigger;
> > >
> > >  	return 0;
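To make that concrete — a minimal sketch only, not the patch's code,
and assuming the iova returned by the mapping is page aligned — keeping
the byte offset within the MSI page means masking the original
address_lo with ~PAGE_MASK rather than PAGE_MASK:

	/* Substitute the iova page, keep the offset within the page. */
	msg.address_hi = (u32)(msi_iova >> 32);
	msg.address_lo = (u32)(msi_iova & PAGE_MASK) |
			 (msg.address_lo & ~PAGE_MASK);
	pci_write_msi_msg(irq, &msg);

With the example numbers above (0x0603_0040 remapped behind
0xf000_0000) this writes address_lo = 0xf000_0040, and the explicit &
with 0x00000000ffffffff becomes unnecessary because the u32 cast
already truncates to the low word.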