On Tue, 2023-01-03 at 09:16 +0100, Niklas Schnelle wrote: > On Mon, 2023-01-02 at 19:25 +0100, Alexandra Winter wrote: > > > > On 02.01.23 12:56, Niklas Schnelle wrote: > > > On s390 .iotlb_sync_map is used to sync mappings to an underlying > > > hypervisor by letting the hypervisor inspect the synced IOVA range and > > > updating its shadow table. This however means that it can fail as the > > > hypervisor may run out of resources. This can be due to the hypervisor > > > being unable to pin guest pages, due to a limit on concurrently mapped > > > addresses such as vfio_iommu_type1.dma_entry_limit or other resources. > > > Either way such a failure to sync a mapping should result in > > > a DMA_MAPPING_EROR. > > > > > > Now especially when running with batched IOTLB flushes for unmap it may > > > be that some IOVAs have already been invalidated but not yet synced via > > > .iotlb_sync_map. Thus if the hypervisor indicates running out of > > > resources, first do a global flush allowing the hypervisor to free > > > resources associated with these mappings and only if that also fails > > > report this error to callers. > > > > > > Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx> > > > --- > > Just a small typo, I noticed > > [...] > > You mean the misspelled DMA_MAPPING_ERROR, right? Either way I did edit > the commit message for a bit more clarity on some of the details: > > On s390 when using a paging hypervisor, .iotlb_sync_map is used to sync > mappings by letting the hypervisor inspect the synced IOVA range and > updating a shadow table. This however means that .iotlb_sync_map can > fail as the hypervisor may run out of resources while doing the sync. > This can be due to the hypervisor being unable to pin guest pages, due > to a limit on mapped addresses such as vfio_iommu_type1.dma_entry_limit > or lack of other resources. Either way such a failure to sync a mapping > should result in a DMA_MAPPING_ERROR. > > Now especially when running with batched IOTLB flushes for unmap it may > be that some IOVAs have already been invalidated but not yet synced via > .iotlb_sync_map. Thus if the hypervisor indicates running out of > resources, first do a global flush allowing the hypervisor to free > resources associated with these mappings as well a retry creating the > new mappings and only if that also fails report this error to callers. > > > > > > diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c > > > index ed33c6cce083..6ba38b4f5b37 100644 > > > --- a/drivers/iommu/s390-iommu.c > > > +++ b/drivers/iommu/s390-iommu.c > > > @@ -210,6 +210,14 @@ static void s390_iommu_release_device(struct device *dev) > > > __s390_iommu_detach_device(zdev); > > > } > > > > > > + > > > +static int zpci_refresh_all(struct zpci_dev *zdev) > > > +{ > > > + return zpci_refresh_trans((u64)zdev->fh << 32, zdev->start_dma, > > > + zdev->end_dma - zdev->start_dma + 1); > > > + > > > +} > > > + > > > static void s390_iommu_flush_iotlb_all(struct iommu_domain *domain) > > > { > > > struct s390_domain *s390_domain = to_s390_domain(domain); > > > @@ -217,8 +225,7 @@ static void s390_iommu_flush_iotlb_all(struct iommu_domain *domain) > > > > > > rcu_read_lock(); > > > list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { > > > - zpci_refresh_trans((u64)zdev->fh << 32, zdev->start_dma, > > > - zdev->end_dma - zdev->start_dma + 1); > > > + zpci_refresh_all(zdev); > > > } > > > rcu_read_unlock(); > > > } > > > @@ -242,20 +249,32 @@ static void s390_iommu_iotlb_sync(struct iommu_domain *domain, > > > rcu_read_unlock(); > > > } > > > > > > -static void s390_iommu_iotlb_sync_map(struct iommu_domain *domain, > > > +static int s390_iommu_iotlb_sync_map(struct iommu_domain *domain, > > > unsigned long iova, size_t size) > > > { > > > struct s390_domain *s390_domain = to_s390_domain(domain); > > > struct zpci_dev *zdev; > > > + int ret = 0; > > > > > > rcu_read_lock(); > > > list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { > > > if (!zdev->tlb_refresh) > > > continue; > > > - zpci_refresh_trans((u64)zdev->fh << 32, > > > - iova, size); > > > + ret = zpci_refresh_trans((u64)zdev->fh << 32, > > > + iova, size); > > > + /* > > > + * let the hypervisor disover invalidated entries > > typo: s/disover/discover/g I'm blind, missed this one. Added the change now. > > > + * allowing it to free IOVAs and unpin pages > > > + */ > > > + if (ret == -ENOMEM) { > > > + ret = zpci_refresh_all(zdev); > > > + if (ret) > > > + break; > > > + } > > > } > > > rcu_read_unlock(); > > > + > > > + return ret; > > > } > > > > > > static int s390_iommu_validate_trans(struct s390_domain *s390_domain, > > [...] >