On Mon, 1 Mar 2010 11:07:33 +0100
Thierry Reding <thierry.reding@xxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> We use a design that incorporates a PCIe switch into an FPGA. Behind
> the switch are a number of PCI-to-PCI (P2P) bridges, each with a
> corresponding endpoint, as shown below.
>
>                        +--------------+
>                        | Root Complex |
>   Host                 +--------------+
>   ============================|============================
>   FPGA                   +-----------+
>                          | PCIe port |
>                          +-----------+
>                               |
>                +--------------+--------------+
>                |              |              |
>             +-----+        +-----+        +-----+
>             | P2P |        | P2P |        | P2P |
>             +-----+        +-----+        +-----+
>                |              |              |
>           +----------+   +----------+   +----------+
>           | Endpoint |   | Endpoint |   | Endpoint |
>           +----------+   +----------+   +----------+
>
> This setup works very well, except for bulk transfers to or from
> individual endpoints, because the FPGA cores often do not support any
> kind of bus mastering. The FPGA cores, at least those we use, do not
> even natively support PCI. These cores are interconnected using the
> WISHBONE interface[1]. We connect the PCI port to the individual
> WISHBONE cores using a special PCI-to-WISHBONE bridge, translating
> PCI accesses to WISHBONE cycles.
>
> In order to fix the problem for bulk transfers we've been thinking
> about implementing a sort of generic PCI DMA mastering framework.
> This framework consists of two parts: one or more DMA masters within
> the PCI hierarchy that can access PCI endpoints as well as system
> RAM, and some kernel driver infrastructure to control these DMA
> masters.
>
> For FPGA cores that do not support DMA transfers natively, their
> driver can now use this framework to initiate bulk transfers to or
> from system RAM, or even to or from another core. The individual
> cores no longer need any mastering capabilities.
>
> In practice, setting up such transfers would look something like
> this: an endpoint driver queries the PCI DMA framework, passing to it
> the source (and/or target?) memory region of future DMA transfers.
> The framework will then look up a matching DMA master and pass a
> handle for it back to the driver, which can then use that handle to
> queue new transfers. Drivers for DMA controllers register the masters
> with the framework to make the functionality available to devices
> mapped within a specific memory region.
>
> In our case the logical place for the DMA master would be within the
> P2P bridges, because they intrinsically know about the memory window
> behind them already.
>
> To avoid duplication, perhaps this could somehow be integrated with
> the existing dmaengine API, though I am not sure how to arrange for
> the additional restrictions for specific memory windows.

I haven't looked at the dmaengine API recently, but it does seem like
you could extend the DMA mapping API to take a target device for P2P
transactions. Those APIs generally have system memory as an implicit
target or source for a given transaction, and the handle reflects
that. To support P2P you'd need to add a few more calls with
source/target device info like you suggest (they could just fall back
to system memory if those args were NULL to make implementation
easier).

-- 
Jesse Barnes, Intel Open Source Technology Center
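
To make the suggested DMA mapping extension concrete, here is a minimal
sketch assuming a hypothetical dma_map_single_peer() helper. Neither the
name nor the interface exists in the kernel; it only illustrates the idea
of naming a P2P target device, with a NULL peer falling back to the
existing system-memory semantics of dma_map_single(). A real version
would also need unmap/sync counterparts.

#include <linux/dma-mapping.h>

/*
 * Hypothetical: "peer" names the P2P target device instead of the
 * implicit system memory target/source of the existing calls.
 */
static inline dma_addr_t dma_map_single_peer(struct device *dev,
					     struct device *peer,
					     void *ptr, size_t size,
					     enum dma_data_direction dir)
{
	if (!peer)
		/* NULL peer: behave exactly like dma_map_single() today */
		return dma_map_single(dev, ptr, size, dir);

	/*
	 * A real implementation would verify that @dev can actually
	 * reach @peer (shared switch/root port, no intervening
	 * restrictions) and return a bus address inside @peer's memory
	 * window; returning 0 here just marks the P2P case as
	 * unimplemented in this sketch.
	 */
	return 0;
}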
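
For the register/look-up/queue flow proposed in the quoted mail, the
framework side might look roughly like the following sketch. All names
(pci_dma_master, pci_dma_register_master(), pci_dma_request_master())
are made up for illustration, and locking and error handling are
omitted.

#include <linux/list.h>
#include <linux/types.h>

struct pci_dma_master;

struct pci_dma_master_ops {
	/*
	 * Queue a transfer between two bus addresses: typically one side
	 * is a DMA-mapped system RAM buffer and the other an endpoint
	 * BAR, but endpoint-to-endpoint would work the same way.
	 */
	int (*queue)(struct pci_dma_master *master, dma_addr_t src,
		     dma_addr_t dst, size_t len,
		     void (*done)(void *data), void *data);
};

struct pci_dma_master {
	struct list_head node;
	resource_size_t window_start;	/* bus window this master can reach */
	resource_size_t window_end;
	const struct pci_dma_master_ops *ops;
};

static LIST_HEAD(pci_dma_masters);

/* Called by a bridge driver to advertise its DMA engine and its window. */
void pci_dma_register_master(struct pci_dma_master *master)
{
	list_add_tail(&master->node, &pci_dma_masters);
}

/* Called by an endpoint driver with the bus range of its BAR(s). */
struct pci_dma_master *pci_dma_request_master(resource_size_t start,
					      resource_size_t end)
{
	struct pci_dma_master *master;

	list_for_each_entry(master, &pci_dma_masters, node)
		if (start >= master->window_start &&
		    end <= master->window_end)
			return master;

	return NULL;
}

In this picture each P2P bridge driver would register one
pci_dma_master covering the window it forwards, and an endpoint driver
would request a master for its BAR range and use the returned queue()
hook for bulk transfers.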