RE: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource() documentation

> -----Original Message-----
> From: Mark Hounschell [mailto:markh@xxxxxxxxxx]
> Sent: Wednesday, May 20, 2015 2:16 PM
> To: William Davis; Bjorn Helgaas
> Cc: joro@xxxxxxxxxx; iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-
> pci@xxxxxxxxxxxxxxx; Terence Ripperda; John Hubbard; jglisse@xxxxxxxxxx;
> konrad.wilk@xxxxxxxxxx; Jonathan Corbet; David S. Miller
> Subject: Re: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource()
> documentation
> 
> On 05/20/2015 01:30 PM, William Davis wrote:
> >
> >
> >> -----Original Message-----
> >> From: Mark Hounschell [mailto:markh@xxxxxxxxxx]
> >> Sent: Wednesday, May 20, 2015 7:11 AM
> >> To: Bjorn Helgaas; William Davis
> >> Cc: joro@xxxxxxxxxx; iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-
> >> pci@xxxxxxxxxxxxxxx; Terence Ripperda; John Hubbard;
> >> jglisse@xxxxxxxxxx; konrad.wilk@xxxxxxxxxx; Jonathan Corbet; David S.
> >> Miller
> >> Subject: Re: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource()
> >> documentation
> >>
> >> On 05/19/2015 07:43 PM, Bjorn Helgaas wrote:
> >>> [+cc Dave, Jonathan]
> >>>
> >>> On Mon, May 18, 2015 at 01:25:01PM -0500, wdavis@xxxxxxxxxx wrote:
> >>>> From: Will Davis <wdavis@xxxxxxxxxx>
> >>>>
> >>>> Add references to both the general API documentation as well as the
> >> HOWTO.
> >>>>
> >>>> Signed-off-by: Will Davis <wdavis@xxxxxxxxxx>
> >>>> ---
> >>>>    Documentation/DMA-API-HOWTO.txt | 39 +++++++++++++++++++++++++++++++++++++--
> >>>>    Documentation/DMA-API.txt       | 36 +++++++++++++++++++++++++++++++-----
> >>>>    2 files changed, 68 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt
> >>>> index 0f7afb2..89bd730 100644
> >>>> --- a/Documentation/DMA-API-HOWTO.txt
> >>>> +++ b/Documentation/DMA-API-HOWTO.txt
> >>>> @@ -138,6 +138,10 @@ What about block I/O and networking buffers?  The block I/O and
> >>>>    networking subsystems make sure that the buffers they use are valid
> >>>>    for you to DMA from/to.
> >>>>
> >>>> +In some systems, it may also be possible to DMA to and/or from a
> >>>> +peer device's MMIO region, as described by a 'struct resource'.
> >>>> +This is referred to as a peer-to-peer mapping.
> >>>> +
> >>>>    			DMA addressing limitations
> >>>>
> >>>>    Does your device have any DMA addressing limitations?  For example, is
> >>>> @@ -648,6 +652,35 @@ Every dma_map_{single,sg}() call should have its dma_unmap_{single,sg}()
> >>>>    counterpart, because the bus address space is a shared resource and
> >>>>    you could render the machine unusable by consuming all bus addresses.
> >>>>
> >>>> +Peer-to-peer DMA mappings can be obtained using dma_map_resource()
> >>>> +to map another device's MMIO region for the given device:
> >>>> +
> >>>> +	struct resource *peer_mmio_res = &other_dev->resource[0];
> >>>> +	dma_addr_t dma_handle = dma_map_resource(dev, peer_mmio_res,
> >>>> +						 offset, size, direction);
> >>>> +	if (dma_handle == 0 || dma_mapping_error(dev, dma_handle))
> >>>> +	{
> >>>> +		/*
> >>>> +		 * If dma_handle == 0, dma_map_resource() is not
> >>>> +		 * implemented, and peer-to-peer transactions will not
> >>>> +		 * work.
> >>>> +		 */
> >>>> +		goto map_error_handling;
> >>>> +	}
> >>>> +
> >>>> +	...
> >>>> +
> >>>> +	dma_unmap_resource(dev, dma_handle, size, direction);
> >>>> +
> >>>> +Here, "offset" means byte offset within the given resource.
> >>>> +
> >>>> +You should both check for a 0 return value and call
> >>>> +dma_mapping_error(), as dma_map_resource() can either be not
> >>>> +implemented or fail and return error as outlined under the dma_map_single() discussion.
> >>>> +
> >>>> +You should call dma_unmap_resource() when DMA activity is
> >>>> +finished, e.g., from the interrupt which told you that the DMA transfer is done.
> >>>> +
> >>>>    If you need to use the same streaming DMA region multiple times and touch
> >>>>    the data in between the DMA transfers, the buffer needs to be synced
> >>>>    properly in order for the CPU and device to see the most up-to-date and
> >>>> @@ -765,8 +798,8 @@ failure can be determined by:
> >>>>
> >>>>    - checking if dma_alloc_coherent() returns NULL or dma_map_sg returns 0
> >>>>
> >>>> -- checking the dma_addr_t returned from dma_map_single() and dma_map_page()
> >>>> -  by using dma_mapping_error():
> >>>> +- checking the dma_addr_t returned from dma_map_single(), dma_map_resource(),
> >>>> +  and dma_map_page() by using dma_mapping_error():
> >>>>
> >>>>    	dma_addr_t dma_handle;
> >>>>
> >>>> @@ -780,6 +813,8 @@ failure can be determined by:
> >>>>    		goto map_error_handling;
> >>>>    	}
> >>>>
> >>>> +- checking if dma_map_resource() returns 0
> >>>> +
> >>>>    - unmap pages that are already mapped, when mapping error occurs in the middle
> >>>>      of a multiple page mapping attempt. These example are applicable to
> >>>>      dma_map_page() as well.
> >>>> diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
> >>>> index 5208840..c25c549 100644
> >>>> --- a/Documentation/DMA-API.txt
> >>>> +++ b/Documentation/DMA-API.txt
> >>>> @@ -283,14 +283,40 @@ and <size> parameters are provided to do partial page mapping, it is
> >>>>    recommended that you never use these unless you really know what the
> >>>>    cache width is.
> >>>>
> >>>> +dma_addr_t
> >>>> +dma_map_resource(struct device *dev, struct resource *res,
> >>>> +		 unsigned long offset, size_t size,
> >>>> +		 enum dma_data_direction direction)
> >>>> +
> >>>> +API for mapping resources. This API allows a driver to map a peer
> >>>> +device's resource for DMA. All the notes and warnings for the
> >>>> +other APIs apply here. Also, the success of this API does not
> >>>> +validate or guarantee that peer-to-peer transactions between the
> >>>> +device and its peer will be functional. They only grant access so
> >>>> +that if such transactions are possible, an IOMMU will not prevent
> >>>> +them from succeeding.
> >>>
> >>> If the driver can't tell whether peer-to-peer accesses will actually
> >>> work, this seems like sort of a dubious API.  I'm trying to imagine
> >>> how a driver would handle this.  I guess whether peer-to-peer works
> >>> depends on the underlying platform (not the devices themselves)?  If
> >>> we run the driver on a platform where peer-to-peer *doesn't* work,
> >>> what happens?  The driver can't tell, so we just rely on the user to
> >>> say "this isn't working as expected"?
> >>>
> >>
> >
> > Yes, it's quite difficult to tell whether peer-to-peer will actually
> > work, and it usually involves some probing and heuristics on the driver's part.
> > I wouldn't say that this makes it a dubious API - it's a piece of the
> > puzzle that's absolutely necessary for a driver to set up peer-to-peer
> > in an IOMMU environment.
> >
> 
> I currently just do
> 
> page = virt_to_page(__va(bus_address));
> 
> then just use the normal API. Works for writes anyway.
> 

What platform is this on? I don't understand how that could work for
peer-to-peer. As I understand it, there are no 'struct page's for MMIO
regions, and you could actually end up with a very different page as a
result of that unchecked translation (e.g., a "valid" struct page, but
not in the peer's MMIO range at all).
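
To illustrate what I mean, here is roughly how that peer BAR would be mapped
with the API proposed in this series instead (just a sketch; "other_dev" and
BAR 0 are placeholders for whatever peer device/region your driver targets,
and error handling is abbreviated):

	/* Map BAR 0 of the peer device for DMA performed by "dev". */
	struct resource *peer_res = &other_dev->resource[0];
	size_t size = resource_size(peer_res);
	dma_addr_t dma_handle;

	dma_handle = dma_map_resource(dev, peer_res, 0, size,
				      DMA_BIDIRECTIONAL);
	if (dma_handle == 0 || dma_mapping_error(dev, dma_handle)) {
		/* Not implemented on this platform, or the mapping failed. */
		goto map_error_handling;
	}

	/* ... program "dev" to DMA to/from dma_handle ... */

	dma_unmap_resource(dev, dma_handle, size, DMA_BIDIRECTIONAL);

No 'struct page' is involved at any point, which is the whole reason for
adding a resource-based mapping interface.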

> >> Most currently available hardware doesn't allow reads but will allow
> >> writes on PCIe peer-to-peer transfers. All current AMD chipsets are
> >> this way. I'm pretty sure all Intel chipsets are this way also.
> >
> > Most != all. As an example, Mellanox offers the ability to do
> > peer-to-peer
> > transfers:
> >
> > http://www.mellanox.com/page/products_dyn?product_family=116
> >
> > which would indicate there is at least some platform out there which
> > allows peer-to-peer reads. I don't think that that being a minority
> > configuration should preclude it from support.
> >
> >> What happens with reads
> >> is they are just dropped with no indication of error other than the
> >> data will not be as expected. Supposedly the PCIe spec does not even
> >> require any peer-to-peer support. Regular PCI there is no problem and
> >> this API could be useful. However I doubt seriously you will find a
> >> pure PCI motherboard that has an IOMMU.
> >>
> >> I don't understand the chipset manufactures reasoning for disabling
> >> PCIe peer-to-peer reads. We would like to make PCIe versions of our
> >> cards but their application requires  peer-to-peer reads and writes.
> >> So we cannot develop PCIe versions of the cards.
> >>
> >> Again, Regular PCI there is no problem and this API could be useful.
> >> IOMMU or not.
> >> If we had a pure PCI with IOMMU env, how will this API handle when
> >> the 2 devices are on the same PCI bus. There will be NO IOMMU between
> >> the devices on the same bus. Does this API address that configuration?
> >>
> >
> > What is the expected behavior in this configuration? That the "mapping"
> > simply be the bus address (as in the nommu case)?
> >
> 
> I suspect just using the bus address would sort of defeat one or more
> purposes of the IOMMU. The bus address would certainly be what I would want
> to use though.
> 
> > In an IOMMU environment, the DMA ops would be one of the IOMMU
> > implementations, so these APIs would create a mapping for the peer
> > device resource, even if it's on the same bus. Would a transaction
> > targeting that mapping be forwarded upstream until it hits an IOMMU,
> > which would then send the translated request back downstream? Or is my
> > understanding of this configuration incorrect?
> >
> 
> It's my understanding of the IOMMU that is lacking here. I have no idea if
> that is actually what would happen. Does it?

It's a mildly educated guess, but I haven't tested this configuration to
know for sure.
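
To make the earlier same-bus question concrete: if the nommu-style behavior
you describe is what's wanted, I'd expect the "mapping" to reduce to the
resource's bus address plus the offset, something like the following
(purely a hypothetical sketch of what such an implementation could look
like, not what the IOMMU ops in this series do):

	/*
	 * Hypothetical nommu-style implementation: no translation, the
	 * returned "DMA address" is just the bus address of the MMIO
	 * resource plus the byte offset into it.
	 */
	static dma_addr_t nommu_map_resource(struct device *dev,
					     struct resource *res,
					     unsigned long offset,
					     size_t size,
					     enum dma_data_direction dir)
	{
		return (dma_addr_t)(res->start + offset);
	}

Whether a transaction targeting that address would be claimed on the shared
bus before ever reaching an IOMMU is exactly the part I can't confirm.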

Thanks,
Will

--
nvpublic
