On Tue, 8 Nov 2016 21:56:29 +0530
Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:

> On 11/8/2016 5:15 AM, Alex Williamson wrote:
> > On Sat, 5 Nov 2016 02:40:45 +0530
> > Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
> > 
> ...
> >>
> >> +int vfio_register_notifier(struct device *dev, struct notifier_block *nb)
> > 
> > Is the expectation here that this is a generic notifier for all
> > vfio->mdev signaling?  That should probably be made clear in the mdev
> > API to avoid vendor drivers assuming their notifier callback only
> > occurs for unmaps, even if that's currently the case.
> > 
> 
> Ok. I'm adding a comment about the notifier callback to mdev_device,
> which is part of the next patch.
> 
> ...
> 
> >>  	mutex_lock(&iommu->lock);
> >>  
> >> -	if (!iommu->external_domain) {
> >> +	/* Fail if notifier list is empty */
> >> +	if ((!iommu->external_domain) || (!iommu->notifier.head)) {
> >>  		ret = -EINVAL;
> >>  		goto pin_done;
> >>  	}
> >> @@ -867,6 +870,11 @@ unlock:
> >>  	/* Report how much was unmapped */
> >>  	unmap->size = unmapped;
> >>  
> >> +	if (unmapped && iommu->external_domain)
> >> +		blocking_notifier_call_chain(&iommu->notifier,
> >> +					     VFIO_IOMMU_NOTIFY_DMA_UNMAP,
> >> +					     unmap);
> > 
> > This is after the fact, there's already a gap here where pages are
> > unpinned and the mdev device is still running.
> 
> Oh, there is a bug here: unpin_pages() now takes user_pfn as an argument
> and finds the vfio_dma from it.  If it's not found, it doesn't unpin the
> pages.  We have to call this notifier before vfio_remove_dma().  But if
> we call it before vfio_remove_dma() there will be a deadlock, since
> iommu->lock is already held here and vfio_iommu_type1_unpin_pages() will
> also try to take iommu->lock.
> If we want to call blocking_notifier_call_chain() before
> vfio_remove_dma(), the sequence should be:
> 
>         unmapped += dma->size;
>         mutex_unlock(&iommu->lock);
>         if (iommu->external_domain) {
>                 struct vfio_iommu_type1_dma_unmap nb_unmap;
> 
>                 nb_unmap.iova = dma->iova;
>                 nb_unmap.size = dma->size;
>                 blocking_notifier_call_chain(&iommu->notifier,
>                                              VFIO_IOMMU_NOTIFY_DMA_UNMAP,
>                                              &nb_unmap);
>         }
>         mutex_lock(&iommu->lock);
>         vfio_remove_dma(iommu, dma);

It seems like it would be worthwhile to have the rb-tree rooted in the
vfio_dma, then we only need to call the notifier if there are pages
pinned within that vfio_dma (ie. the rb-tree is not empty).  We can then
release the lock, call the notifier, re-acquire the lock, and BUG_ON if
the rb-tree is still not empty.  We might get duplicate pfns between
separate vfio_dma structs, but as I mentioned in other replies, that
seems like an exception that we don't need to optimize for.

> > The notifier needs to
> > happen prior to that and I suspect that we need to validate that we
> > have no remaining external pfn references within this vfio_dma block.
> > It seems like we need to root our pfn tracking in the vfio_dma so that
> > we can see that it's empty after the notifier chain and BUG_ON if not.
> 
> There is no way to find pfns from that iova range with the current
> implementation.  We could do that validation if we go with a linear
> array of iova to track pfns.

Right, I was still hoping to avoid storing the pfn even with the
array/page-table approach though, and instead ask the mm layer for the
mapping again.  Is that too much overhead?  Maybe the page table could
store the phys addr and we could use the bits below PAGE_MASK to store
the reference count, so that each entry is still only 8 bytes(?)
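To sketch the vfio_dma-rooted tracking and the unlock/notify/re-lock
sequence described above (the pfn_list field and the
vfio_notify_dma_unmap() helper are hypothetical names, not code from the
posted patch):

struct vfio_dma {
	struct rb_node		node;
	dma_addr_t		iova;		/* Device address */
	size_t			size;		/* Map size (bytes) */
	/* ... existing fields ... */
	struct rb_root		pfn_list;	/* Externally pinned pfns */
};

/*
 * Sketch only: called from the unmap path for each affected vfio_dma
 * before vfio_remove_dma().
 */
static void vfio_notify_dma_unmap(struct vfio_iommu *iommu,
				  struct vfio_dma *dma)
{
	struct vfio_iommu_type1_dma_unmap nb_unmap = {
		.iova = dma->iova,
		.size = dma->size,
	};

	/* No external pins within this range, nothing to notify */
	if (RB_EMPTY_ROOT(&dma->pfn_list))
		return;

	/*
	 * Drop the lock so the vendor driver's callback can call
	 * vfio_unpin_pages(), which takes iommu->lock itself.
	 */
	mutex_unlock(&iommu->lock);
	blocking_notifier_call_chain(&iommu->notifier,
				     VFIO_IOMMU_NOTIFY_DMA_UNMAP,
				     &nb_unmap);
	mutex_lock(&iommu->lock);

	/* The driver must have dropped every pin within this vfio_dma */
	BUG_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
}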
> > I would also add some enforcement that external pinning is only enabled
> > when vfio_iommu_type1 is configured for v2 semantics (ie. we only
> > support unmaps exactly matching previous maps).
> > 
> 
> Ok. I'll add that check.
> 
> Thanks,
> Kirti
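To spell out the kind of check meant above (sketch only, assuming the
existing iommu->v2 flag; the error code is just a placeholder):

	/* In vfio_iommu_type1_pin_pages() */
	mutex_lock(&iommu->lock);

	/*
	 * External pinning only makes sense with v2 unmap semantics,
	 * ie. unmaps exactly matching previous maps.
	 */
	if (!iommu->v2) {
		ret = -EACCES;
		goto pin_done;
	}

	/* Fail if notifier list is empty */
	if (!iommu->external_domain || !iommu->notifier.head) {
		ret = -EINVAL;
		goto pin_done;
	}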