On Wed, Feb 7, 2018 at 7:46 PM, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> On Wed, Feb 07, 2018 at 05:03:34PM +0200, Ran Shalit wrote:
>> On Wed, Feb 7, 2018 at 4:39 PM, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>> > On Tue, Feb 06, 2018 at 09:02:49PM +0200, Ran Shalit wrote:
>> >> Hello,
>> >>
>> >> I am writing a PCI Linux driver, and after reviewing some examples
>> >> in the kernel, there is an issue I am not sure about.
>> >> DMA seems to follow these steps in PCI:
>> >> 1. The driver allocates DMA buffers and maps them to a BAR.
>> >
>> > The driver needs to allocate and map buffers for DMA.  See
>> > Documentation/DMA-API-HOWTO.txt for a start.
>> >
>> > DMA has nothing to do with BARs.  BARs describe address space on the
>> > *device* that can be accessed by the CPU.  BARs are used for
>> > "programmed I/O", not for DMA.
>> >
>> >> 2. The driver initializes the device so that it knows the DMA addresses.
>> >
>> > When the driver creates a DMA mapping, for example, by calling
>> > dma_map_single(), it gets the DMA address.  It can then program the
>> > DMA address into the device.
>> >
>> >> 3. TX from CPU to device: the driver triggers DMA by writing into
>> >> the device BARs.
>> >
>> > If you mean the device is doing a DMA read from system memory, the
>> > driver must first allocate and DMA-map a buffer, then tell the device
>> > what the DMA address is, then tell the device to start the DMA.  The
>> > driver would typically use the BARs to give the device the buffer's
>> > DMA address and the "DMA start" command.
>> >
>> >> 4. RX from device to CPU: the device is responsible for triggering
>> >> DMA (the driver does not do anything here).
>> >
>> > The driver *always* has to perform DMA mapping, regardless of whether
>> > the device is doing DMA reads or writes.  In this case, I think you're
>> > talking about a device doing a DMA write to system memory.  The driver
>> > must have previously performed the mapping and told the device "here's
>> > a DMA address where you can write things in the future."
>>
>> Is it that in this case (external device -> system memory), unlike the
>> previous one (system memory -> external device), the driver does not
>> generate the transaction, but the device triggers it by itself?
>
> DMA by definition is a transaction generated by the device, either a
> DMA read (the device is reading from system memory) or a DMA write
> (the device is writing to system memory).
>
> Obviously in both cases the device needs to know an address to use,
> and the driver is responsible for performing a DMA mapping to get that
> address, and the driver conveys the address to the device before the
> DMA can happen.
>
>> >> 5. TX/RX completion: interrupt from the DMA (in the CPU).
>> >
>> > Typically a device generates an interrupt when DMA is complete.
>> >
>> >> 6. The interrupt handler from step 5 checks that the device finished
>> >> the TX/RX.
>> >>
>> >> Is the above understanding correct?
>> >> 1. As to step 6, I am not sure: how does the device know that the
>> >> transaction is finished (the DMA engine is in the SoC, not the device)?
>> >
>> > This is device-specific.
>> >
>> >> 2. How does the device trigger DMA from device to CPU?  Is it simply
>> >> by writing to a BAR?
>> >
>> > In most cases the device itself performs the DMA.  It sounds like
>> > you have a case where the DMA engine is not part of the device itself.
>> > I don't know how that would be programmed, except to say that the DMA
>> > engine presumably operates on DMA addresses, and something would still
>> > have to DMA-map buffers to obtain those.
>> >
>> > BARs are typically used for programming the device, i.e., the driver
>> > would use them.  BARs are not in the data path for a DMA transfer.
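
Just to check that I follow the flow you describe above, I tried to write
it down as a minimal sketch (everything here is invented for illustration:
the struct, the BAR0 register offsets and the single static buffer are not
taken from any real device):

#include <linux/pci.h>
#include <linux/dma-mapping.h>
#include <linux/interrupt.h>
#include <linux/kernel.h>
#include <linux/io.h>

/* Hypothetical BAR0 registers of an imaginary device. */
#define MYDEV_REG_DMA_ADDR_LO   0x00
#define MYDEV_REG_DMA_ADDR_HI   0x04
#define MYDEV_REG_DMA_LEN       0x08
#define MYDEV_REG_DMA_START     0x0c

struct mydev {
        struct pci_dev *pdev;
        void __iomem *bar0;     /* from pci_iomap(pdev, 0, 0) in probe() */
        dma_addr_t tx_dma;
        size_t tx_len;
};

/* CPU -> device: map a buffer and tell the device to DMA-read it. */
static int mydev_start_tx(struct mydev *priv, void *buf, size_t len)
{
        struct device *dev = &priv->pdev->dev;

        /* The driver creates the mapping and obtains the DMA address. */
        priv->tx_dma = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
        if (dma_mapping_error(dev, priv->tx_dma))
                return -ENOMEM;
        priv->tx_len = len;

        /* Programmed I/O through BAR0: give the device the DMA address
         * and length, then issue the "DMA start" command.  The transfer
         * itself is then generated by the device, not by the CPU.
         */
        iowrite32(lower_32_bits(priv->tx_dma), priv->bar0 + MYDEV_REG_DMA_ADDR_LO);
        iowrite32(upper_32_bits(priv->tx_dma), priv->bar0 + MYDEV_REG_DMA_ADDR_HI);
        iowrite32(len, priv->bar0 + MYDEV_REG_DMA_LEN);
        iowrite32(1, priv->bar0 + MYDEV_REG_DMA_START);
        return 0;
}

/* The device raises an interrupt when its DMA is complete. */
static irqreturn_t mydev_irq(int irq, void *data)
{
        struct mydev *priv = data;

        dma_unmap_single(&priv->pdev->dev, priv->tx_dma, priv->tx_len,
                         DMA_TO_DEVICE);
        /* ... hand the buffer back / wake up the submitter ... */
        return IRQ_HANDLED;
}

Does that capture the idea: the driver owns the mapping, and the device
itself generates the DMA and signals completion with an interrupt?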
>>
>> I think I have some basic misunderstanding which keeps confusing me
>> about DMA with PCI.
>> I first thought that Linux drivers should DMA-map buffers only when
>> the DMA engine is in the CPU host (internal DMA engine -> controller).
>> From the above clarification it seems that mapping DMA buffers is
>> also needed when the DMA engine is in the external device
>> (controller ---|--pci--|--- DMA engine of external device).  Is that
>> correct?  If yes, why is it that DMA buffers are needed in both cases?
>
> In general DMA transactions do not go through the CPU virtual memory
> mappings (CPU page tables, TLBs, etc).  Sometimes DMA transactions go
> through an IOMMU that maps bus addresses to physical system memory
> addresses.  The DMA mapping APIs are defined to account for whatever
> address translations are required between the device and system
> memory.  Sometimes there is no translation, sometimes the DMA API
> needs to set up IOMMU mappings, etc.
>
>> The above is just for my understanding of Linux PCI drivers and DMA.
>> I actually have another case: I need to use a DMA engine which is
>> part of the SoC (an Armada chip) and is connected through PCI to an
>> FPGA.  It seems that most drivers do not handle this case.  Is there
>> any example which does?
>
> I don't know myself.  You might ask the ARM folks, who are more
> familiar with SoC architectures.  The basic problem is you need to
> know how the addresses generated by the DMA engine are interpreted:
> Are they translated via the same MMU used by CPU accesses?  Do they go
> through an IOMMU?  etc.

I did find Intel's "mid dma" driver, which uses both a PCI device driver
and dmaengine:
http://elixir.free-electrons.com/linux/v3.8/source/drivers/dma/intel_mid_dma.c
Interestingly, it was removed in recent kernels.  I also did not find any
other PCI device driver which uses dmaengine, so it seems I am the first
one to do this sort of thing...
Alternatively, I will check with Marvell's Armada support whether they
support DMA together with PCI (probably not, I guess).

Thank you,
Ran
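
P.S. In case it helps to make the question concrete, this is roughly what
I have in mind for driving the SoC's DMA engine toward the FPGA.  It is
only a sketch: the channel is requested as a generic DMA_MEMCPY channel,
and whether the FPGA's BAR physical address can be handed to the engine
directly as a bus address (no translation) is exactly the open question
you raise, so that line is an assumption, not something the dmaengine API
promises:

#include <linux/pci.h>
#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>

/* Copy a kernel buffer out to the FPGA's BAR0 using the SoC dmaengine. */
static int fpga_memcpy_to_bar(struct pci_dev *pdev, void *buf, size_t len)
{
        struct dma_async_tx_descriptor *tx;
        struct dma_chan *chan;
        dma_cap_mask_t mask;
        dma_cookie_t cookie;
        dma_addr_t src, dst;
        int ret = 0;

        dma_cap_zero(mask);
        dma_cap_set(DMA_MEMCPY, mask);
        chan = dma_request_channel(mask, NULL, NULL);   /* any memcpy channel */
        if (!chan)
                return -ENODEV;

        /* Map the source buffer for the DMA engine's device, not the FPGA. */
        src = dma_map_single(chan->device->dev, buf, len, DMA_TO_DEVICE);
        if (dma_mapping_error(chan->device->dev, src)) {
                ret = -ENOMEM;
                goto out_release;
        }

        /* ASSUMPTION: BAR0's physical address is directly usable as the
         * destination bus address for this DMA engine (no IOMMU, no
         * translation of the PCI window).
         */
        dst = pci_resource_start(pdev, 0);

        tx = dmaengine_prep_dma_memcpy(chan, dst, src, len, DMA_PREP_INTERRUPT);
        if (!tx) {
                ret = -EIO;
                goto out_unmap;
        }

        cookie = dmaengine_submit(tx);
        dma_async_issue_pending(chan);
        if (dma_sync_wait(chan, cookie) != DMA_COMPLETE)  /* polling, sketch only */
                ret = -EIO;

out_unmap:
        dma_unmap_single(chan->device->dev, src, len, DMA_TO_DEVICE);
out_release:
        dma_release_channel(chan);
        return ret;
}

If the Armada engine turns out to need some translation for the PCI
window, I guess the dst assignment above is where that would show up.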