Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> · Wed, 19 Apr 2017 11:14:51 -0600

On Wed, Apr 19, 2017 at 10:48:51AM -0600, Logan Gunthorpe wrote:
> The pci_enable_p2p_bar function would then just need to call
> devm_memremap_pages with the dma_map callback set to a function that
> does the segment check and the offset calculation.

I don't see a use for the dma_map function pointer at this point.

It doesn't make a lot of sense for the completer of the DMA to provide
a mapping op. The mapping process is *path* specific, not specific to
a completer/initiator.

So, I would suggest more like this:

static inline struct device *get_p2p_src(struct page *page)
{
	struct device *res;
	struct dev_pagemap *pgmap;

	if (!is_zone_device_page(page))
		return NULL;

	pgmap = get_dev_pagemap(page_to_pfn(page), NULL);
	if (!pgmap || pgmap->type != MEMORY_DEVICE_P2P) {
		/* For now ZONE_DEVICE memory that is not P2P is
		   assumed to be configured for DMA the same as CPU
		   memory. */
		/* Drop the pagemap reference; put_dev_pagemap() accepts NULL */
		put_dev_pagemap(pgmap);
		return ERR_PTR(-EINVAL);
	}
	res = pgmap->dev;
	get_device(res);
	put_dev_pagemap(pgmap);
	return res;
}

dma_addr_t pci_p2p_same_segment(struct device *initiator,
				struct device *completer,
				struct page *page)
{
	if (!(initiator and completer are both PCI devices))
		return ERROR;
	if (!(initiator and completer are on the same segment))
		return ERROR;

	// Translate page directly to the value programmed into the BAR
	return (Completer's PCI BAR base address) + (offset of page within BAR);
}
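
To make the same-segment translation concrete, here is a minimal
userspace sketch of the arithmetic only. It is not kernel code: struct
p2p_bar_model, P2P_ADDR_ERROR and p2p_same_segment_addr are hypothetical
names invented for illustration, and it assumes the completer's BAR is
described by a CPU physical start, a bus address and a length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one PCI BAR; not a kernel structure. */
struct p2p_bar_model {
	int      segment;   /* PCI segment (domain) of the completer */
	uint64_t cpu_start; /* CPU physical address the BAR is mapped at */
	uint64_t bus_start; /* value programmed into the BAR register */
	uint64_t len;       /* BAR size in bytes */
};

/* Stand-in for the ERROR value used in the pseudocode. */
#define P2P_ADDR_ERROR UINT64_MAX

/*
 * Return the bus address an initiator on initiator_segment would use to
 * reach page_phys inside the completer's BAR. Returns P2P_ADDR_ERROR if
 * the devices are on different segments or the page is outside the BAR.
 */
static uint64_t p2p_same_segment_addr(int initiator_segment,
				      const struct p2p_bar_model *bar,
				      uint64_t page_phys)
{
	assert(bar != NULL);

	if (initiator_segment != bar->segment)
		return P2P_ADDR_ERROR;
	if (page_phys < bar->cpu_start ||
	    page_phys - bar->cpu_start >= bar->len)
		return P2P_ADDR_ERROR;

	/* BAR base as seen on the bus + offset of the page within the BAR */
	return bar->bus_start + (page_phys - bar->cpu_start);
}
```

The point of the sketch is that nothing here depends on a callback
supplied by the completer; the result depends only on the pair of
devices and the page.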

// dma_sg_map

for (each sgl) {
	struct page *page = sg_page(s);
	struct device *p2p_src = get_p2p_src(page);

	if (IS_ERR(p2p_src))
		// fail dma_sg

	if (p2p_src) {
		bool needs_iommu = false;

		pa = pci_p2p_same_segment(dev, p2p_src, page);
		if (pa == ERROR)
			pa = arch_p2p_cross_segment(dev, p2p_src, page, &needs_iommu);

		put_device(p2p_src);

		if (pa == ERROR)
			// fail

		if (!needs_iommu) {
			// Insert PA directly into the result SGL
			sg++;
			continue;
		}
	} else {
		// CPU memory
		pa = page_to_phys(page);
	}

	// ... continue with the existing mapping of pa (IOMMU or direct) ...
}

To me it looks like the code duplication across the iommu stuff comes
from just duplicating the basic iommu algorithm in every driver.

To clean that up I think someone would need to hoist the overall sgl
loop and use more ops callbacks, e.g. allocate_iommu_range,
assign_page_to_range, dealloc_range, etc. This is a problem p2p makes
worse, but isn't directly causing :\
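
As a rough illustration of what hoisting the loop could look like, here
is a self-contained userspace sketch. Everything in it is hypothetical:
struct iommu_range_ops, generic_map_pages and the toy_* backend are
made-up names, and only the three callback names come from the
suggestion above. It assumes each backend can reserve a contiguous
range and fill it one page at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u
#define TOY_IOVA_BASE   0x10000u
#define TOY_SLOTS       8u

/* Hypothetical per-IOMMU callbacks; not an existing kernel interface. */
struct iommu_range_ops {
	/* Reserve npages of contiguous IOVA space; return 0 on failure. */
	uint64_t (*allocate_iommu_range)(void *ctx, size_t npages);
	/* Point one slot of the range at a physical page; 0 on success. */
	int (*assign_page_to_range)(void *ctx, uint64_t iova, uint64_t phys);
	/* Release a range obtained from allocate_iommu_range. */
	void (*dealloc_range)(void *ctx, uint64_t iova, size_t npages);
};

/*
 * The shared loop: allocate a range, assign each page, and release the
 * range if any assignment fails. Returns the base IOVA, or 0 on error.
 */
static uint64_t generic_map_pages(const struct iommu_range_ops *ops, void *ctx,
				  const uint64_t *phys, size_t npages)
{
	uint64_t base = ops->allocate_iommu_range(ctx, npages);
	size_t i;

	if (!base)
		return 0;

	for (i = 0; i < npages; i++) {
		uint64_t iova = base + (uint64_t)i * MODEL_PAGE_SIZE;

		if (ops->assign_page_to_range(ctx, iova, phys[i])) {
			ops->dealloc_range(ctx, base, npages);
			return 0;
		}
	}
	return base;
}

/* Toy backend, present only to exercise the loop. */
struct toy_iommu {
	uint64_t table[TOY_SLOTS]; /* table[n] = phys mapped at slot n */
	int live_ranges;           /* ranges allocated and not yet freed */
};

static uint64_t toy_alloc(void *ctx, size_t npages)
{
	struct toy_iommu *t = ctx;

	if (npages == 0 || npages > TOY_SLOTS)
		return 0;
	t->live_ranges++;
	return TOY_IOVA_BASE;
}

static int toy_assign(void *ctx, uint64_t iova, uint64_t phys)
{
	struct toy_iommu *t = ctx;
	uint64_t slot = (iova - TOY_IOVA_BASE) / MODEL_PAGE_SIZE;

	assert(slot < TOY_SLOTS);
	if (phys == 0) /* toy rule: physical address 0 is rejected */
		return -1;
	t->table[slot] = phys;
	return 0;
}

static void toy_dealloc(void *ctx, uint64_t iova, size_t npages)
{
	struct toy_iommu *t = ctx;

	(void)iova;
	(void)npages;
	t->live_ranges--;
}

static const struct iommu_range_ops toy_ops = {
	.allocate_iommu_range = toy_alloc,
	.assign_page_to_range = toy_assign,
	.dealloc_range        = toy_dealloc,
};
```

With the loop shared like this, a p2p address check would only need to
be added once, in generic_map_pages, rather than in every backend.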

Jason