On Tue, Apr 18, 2017 at 3:42 PM, Jason Gunthorpe
<jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Apr 18, 2017 at 03:28:17PM -0700, Dan Williams wrote:
>
>> Unlike the pci bus address offset case, which I think is fundamental
>> to support since shipping archs do this today
>
> But we can support this by modifying those archs' unique dma_ops
> directly.
>
> Eg as I explained, my p2p_same_segment_map_page() helper concept would
> do the offset adjustment for same-segment DMA.
>
> If PPC calls that in their IOMMU drivers then they will have proper
> support for this basic p2p, and the right framework to move on to more
> advanced cases of p2p.
>
> This really seems like much less trouble than trying to wrap all
> the archs' dma_ops, and doesn't have the wonky restrictions.

I don't think the root bus IOMMU drivers have any business knowing or
caring about DMA happening between devices lower in the hierarchy.

>> I think it is ok to say p2p is restricted to a single sgl that gets
>> to talk to host memory or a single device.
>
> RDMA and GPU would be sad with this restriction...
>
>> That said, what's wrong with a p2p-aware map_sg implementation
>> calling up to the host memory map_sg implementation on a per-sgl
>> basis?
>
> Setting up the IOMMU is fairly expensive, so getting rid of the
> batching would kill performance..

When we're crossing device and host memory boundaries, how much
batching is possible? As far as I can see you'll always be splitting
the sgl on these dma mapping boundaries.
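[Editor's sketch] For readers following along: Jason's
p2p_same_segment_map_page() is a concept from this thread, not an
existing kernel API. A minimal sketch of what it might look like,
assuming a hypothetical p2p_bus_offset() arch hook (every name below
is illustrative):

#include <linux/pci.h>
#include <linux/dma-mapping.h>

static u64 p2p_bus_offset(struct pci_dev *pdev)
{
	/* Hypothetical arch hook: CPU-physical to PCI bus address
	 * offset for the host bridge above @pdev. Zero on typical
	 * x86; non-zero on the shipping archs Dan refers to. */
	return 0;
}

static dma_addr_t p2p_same_segment_map_page(struct pci_dev *initiator,
					    struct pci_dev *target,
					    phys_addr_t paddr)
{
	/* Same-segment p2p only: a real version would verify both
	 * devices sit under the same host bridge, not merely the
	 * same PCI domain. */
	if (pci_domain_nr(initiator->bus) != pci_domain_nr(target->bus))
		return 0;	/* caller treats 0 as a mapping failure */

	/* The peer's BAR is reached via its bus address, so the only
	 * translation needed is the fixed arch offset; the IOMMU is
	 * never programmed for same-segment traffic, which is the
	 * point of Jason's proposal. */
	return paddr - p2p_bus_offset(target);
}

If the PPC IOMMU drivers called such a helper when they detect a p2p
page, they would apply the offset themselves, which is why Jason sees
no need to wrap the arch dma_ops from outside.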
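[Editor's sketch] Dan's closing point, made concrete: a p2p-aware
map_sg has no choice but to cut the scatterlist at every boundary
between host memory and peer device memory, so batching only survives
within each homogeneous run. A rough sketch, where is_p2p_page() and
p2p_map_sg_segment() are invented for illustration:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Hypothetical helpers, not existing kernel API: */
static bool is_p2p_page(struct page *page);
static int p2p_map_sg_segment(struct device *dev,
			      struct scatterlist *sg, int nents);

static int p2p_aware_map_sg(struct device *dev, struct scatterlist *sgl,
			    int nents, enum dma_data_direction dir)
{
	struct scatterlist *start = sgl, *sg;
	bool p2p = is_p2p_page(sg_page(sgl));
	int i, run = 0, count = 0;

	for_each_sg(sgl, sg, nents, i) {
		if (is_p2p_page(sg_page(sg)) != p2p) {
			/* Boundary crossed: map the run accumulated so
			 * far with whichever mapper owns it. */
			count += p2p ? p2p_map_sg_segment(dev, start, run)
				     : dma_map_sg(dev, start, run, dir);
			start = sg;
			p2p = !p2p;
			run = 0;
		}
		run++;
	}
	/* Map the trailing run. */
	count += p2p ? p2p_map_sg_segment(dev, start, run)
		     : dma_map_sg(dev, start, run, dir);
	return count;
}

Each dma_map_sg() call here covers only one run, so the IOMMU setup
cost Jason worries about is paid once per boundary crossing either
way; the batching loss Dan describes is inherent to mixed sgls, not
to the per-sgl dispatch he proposed.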