On Thu, Sep 21, 2017 at 08:00:24PM +1000, Alexey Kardashevskiy wrote:
> Clearing very big IOMMU tables can trigger soft lockups. This adds
> cond_resched() for every million TCE updates.
>
> The testcase is POWER9 box with 264GB guest, 4 VFIO devices from
> independent IOMMU groups, 64K IOMMU pages. This configuration produces
> 4325376 TCE entries, each entry update incurs 4 OPAL calls to update
> an individual PE TCE cache. Reducing table size to 4194304 (i.e. 256GB
> guest) or removing one of 4 VFIO devices makes the problem go away so
> doing cond_resched() after every million TCE updates seems sufficient.
>
> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxxxx>

Reviewed-by: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>

> ---
>  drivers/vfio/vfio_iommu_spapr_tce.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> index 63112c36ab2d..be3839ea3150 100644
> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> @@ -502,11 +502,15 @@ static int tce_iommu_clear(struct tce_container *container,
>  		struct iommu_table *tbl,
>  		unsigned long entry, unsigned long pages)
>  {
> -	unsigned long oldhpa;
> +	unsigned long oldhpa, n;
>  	long ret;
>  	enum dma_data_direction direction;
>
> -	for ( ; pages; --pages, ++entry) {
> +	for (n = 0; pages; --pages, ++entry, ++n) {
> +
> +		if (n && (n % 1000000 == 0))
> +			cond_resched();
> +
>  		direction = DMA_NONE;
>  		oldhpa = 0;
>  		ret = iommu_tce_xchg(tbl, entry, &oldhpa, &direction);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
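
For anyone who wants to check the figures in the quoted commit message, here is a rough userspace sketch. It is not kernel code: tce_entries(), yield_points() and check_figures() are hypothetical names made up for illustration, and the loop only counts where the patched loop would reach cond_resched(); it does not reschedule anything. It assumes "GB" means 2^30 bytes and 64K IOMMU pages means a page shift of 16.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of TCE entries needed to map guest_bytes
 * of guest memory with IOMMU pages of (1 << page_shift) bytes. */
uint64_t tce_entries(uint64_t guest_bytes, unsigned int page_shift)
{
	return guest_bytes >> page_shift;
}

/* Hypothetical helper: replay the counter from the patched loop and count
 * how often the "n && (n % interval == 0)" test is true. */
uint64_t yield_points(uint64_t entries, uint64_t interval)
{
	uint64_t n, yields = 0;

	for (n = 0; n < entries; ++n) {
		if (n && (n % interval == 0))
			++yields;	/* the kernel loop calls cond_resched() here */
	}
	return yields;
}

/* Usage: the numbers from the commit message, assuming GB == 2^30 bytes. */
void check_figures(void)
{
	/* 264GB guest, 64K IOMMU pages */
	assert(tce_entries(264ULL << 30, 16) == 4325376ULL);
	/* 256GB guest, 64K IOMMU pages */
	assert(tce_entries(256ULL << 30, 16) == 4194304ULL);
	/* 4 OPAL calls per entry update, as stated in the commit message */
	assert(tce_entries(264ULL << 30, 16) * 4 == 17301504ULL);
	/* clearing the whole 264GB table passes 4 yield points */
	assert(yield_points(4325376ULL, 1000000ULL) == 4);
}
```

Under these assumptions, clearing the full table for the 264GB guest would reach cond_resched() four times. The sketch only reproduces the arithmetic; it says nothing about where the soft-lockup threshold actually sits.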