Re: s390-iommu.c default domain conversion

On Tue, 2022-05-10 at 13:09 -0300, Jason Gunthorpe wrote:
> On Tue, May 10, 2022 at 11:25:54AM -0400, Matthew Rosato wrote:
> > On 5/9/22 7:35 PM, Jason Gunthorpe wrote:
> > > Hi s390 folks/Matthew
> > > 
> > > Since everyone is looking at iommu support for the nested domains,
> > > could we also tackle the default domain conversion please? s390 is one
> > > of the last 4 drivers that need it.
> > > 
> > > From what I can see it looks like when detach_dev() is called it
> > > expects the platform's dma_ops to work in arch/s390/pci/pci_dma.c ?
> > 
> > Yes
> > 
> > > Has anyone thought about converting the dma_ops to use the normal DMA
> > > API iommu support and run it through the iommu driver instead of
> > > through the dma_ops?
> > > 
> > > Alternatively perhaps we can keep the dma_ops with some iommu
> > > side-change.
> > 
> > It has come up before.  So ultimately the goal is to be driving the dma
> > through the default iommu domain (via dma-iommu) rather than directly in the
> > dma_ops?  One of our main concerns is performance loss from s390-ism
> > optimizations in the dma_ops like RPCIT avoidance / lazy map +
> > global flush
> 
> The core version is somewhat different, it triggers the
> iotlb_flush_all from a timer, not just on address space wrap around,
> but the fast path on unmap can still skip the zpci_refresh_trans().
> 
> On the other hand it doesn't have the limit of iova space, and the
> iova allocator is somewhat more sophisticated which will optimize
> large page cases that s390 currently doesn't. Basically it will work
> better with things like mlx5 cards in the normal case.
> 
> The lazy flush is done via the IOMMU_DOMAIN_DMA_FQ and the iommu gather->queued
> stuff to allow skipping the RPCIT during the normal iotlb_sync.
> 
> > I think the reality is that Niklas and I need to have a close look and do
> > some testing on our end to see what it will take and if we can get
> > acceptable performance from a conversion, then get back to you.
> 
> It would be a good long term goal, getting rid of these duplicated
> dma_ops is another open task. There is a patch series out there to
> convert arm, so this whole area will become even more niche.
> 
> But another path is to somehow keep them and just allow default
> domains to work - ARM did this.
> 
> Jason

I did some testing and created a prototype that gets rid of
arch/s390/pci/pci_dma.c and works solely via dma-iommu on top of our IOMMU
driver. It looks like the existing dma-iommu code allows us to do this
with relatively simple changes to the IOMMU driver only, mostly just
implementing iotlb_sync(), iotlb_sync_map() and flush_iotlb_all(), so
that's great. These callbacks also seem to map quite well to our RPCIT
I/O TLB flush. For now the prototype still uses 4k pages only.
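
To illustrate what I mean by the callbacks mapping onto RPCIT, here is a
minimal userspace sketch, not the prototype code. ToyDomain, its fields
and the _rpcit() helper are hypothetical names made up for this example,
and I'm assuming each callback boils down to a single range refresh:

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # the prototype uses 4k pages only


@dataclass
class ToyDomain:
    """Hypothetical stand-in for an IOMMU domain; this is not kernel code."""
    aperture_start: int
    aperture_end: int  # inclusive
    rpcit_log: list = field(default_factory=list)

    def _rpcit(self, start, length):
        # Stand-in for one RPCIT refresh of [start, start + length).
        self.rpcit_log.append((start, length))

    def iotlb_sync_map(self, iova, size):
        # Called after mapping: refresh the freshly mapped range.
        self._rpcit(iova, size)

    def iotlb_sync(self, gather_start, gather_end):
        # Called after unmapping: refresh the gathered range.
        # gather_end is treated as inclusive in this sketch.
        self._rpcit(gather_start, gather_end - gather_start + 1)

    def flush_iotlb_all(self):
        # Global flush: refresh the whole aperture in one go.
        self._rpcit(self.aperture_start,
                    self.aperture_end - self.aperture_start + 1)


dom = ToyDomain(aperture_start=0x1000_0000, aperture_end=0x1FFF_FFFF)
dom.iotlb_sync_map(0x1000_0000, 2 * PAGE_SIZE)
dom.iotlb_sync(0x1000_0000, 0x1000_0000 + 2 * PAGE_SIZE - 1)
dom.flush_iotlb_all()
for start, length in dom.rpcit_log:
    print(f"RPCIT start={start:#x} length={length:#x}")
```

In this model all three entry points end up in the same refresh helper
and differ only in the range they pass.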

With that, the performance on the LPAR machine hypervisor (no paging) is
on par with our existing code. On paging hypervisors (z/VM and KVM),
i.e. with the hypervisor shadowing the I/O translation tables, it's
still slower than our existing code and, interestingly, strict mode seems
to be better than lazy here. One thing I haven't done yet is implementing
the map_pages() operation or adding larger page sizes. Maybe you have
some tips on what you'd expect to be most beneficial? Either way, we're
optimistic this can be solved, and this conversion will be a
high-ranking item on my backlog going forward.
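
As a rough feel for what map_pages() plus larger pages could save, the
toy calculation below splits an IOVA range into runs of equally sized
pages, where each run would correspond to one map_pages()-style call
with a page size and a page count. This is a simplified sketch: the 1M
page size is only an assumption for illustration, split_into_runs() is a
hypothetical helper, and physical address alignment is ignored.

```python
SZ_4K = 4 << 10
SZ_1M = 1 << 20  # assumed larger page size, for illustration only


def split_into_runs(iova, size, page_sizes):
    """Greedily cover [iova, iova + size) with the largest usable pages.

    Returns a list of (start_iova, page_size, page_count) runs.
    """
    smallest = min(page_sizes)
    if iova % smallest or size % smallest:
        raise ValueError("range must be aligned to the smallest page size")
    runs = []
    while size:
        # Largest page size that is aligned at iova and still fits.
        pgsize = max(p for p in page_sizes if iova % p == 0 and p <= size)
        if runs and runs[-1][1] == pgsize:
            # Same size as the previous page: extend that run.
            start, _, count = runs[-1]
            runs[-1] = (start, pgsize, count + 1)
        else:
            runs.append((iova, pgsize, 1))
        iova += pgsize
        size -= pgsize
    return runs


buf_iova, buf_size = 0x10_0000, 2 * SZ_1M + 3 * SZ_4K
only_4k = split_into_runs(buf_iova, buf_size, [SZ_4K])
mixed = split_into_runs(buf_iova, buf_size, [SZ_4K, SZ_1M])
print("4k only:", only_4k)
print("4k + 1M:", mixed)
```

With 4k pages only, the example buffer is a single run of 515 pages,
while with the assumed 1M size it becomes two 1M pages followed by three
4k pages, so far fewer translation entries need to be written and
shadowed.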

I also stumbled over the following patch series which I think would
also help our paging hypervisor cases a lot since it should alleviate
the cost of shadowing short lived mappings:

https://lore.kernel.org/linux-iommu/20210806103423.3341285-1-stevensd@xxxxxxxxxx/

Sadly it seems it hasn't gained much traction so far.

Thanks,
Niklas



