On Mon, 1 Mar 2010 11:07:33 +0100
Thierry Reding <thierry.reding@xxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> We use a design that incorporates a PCIe switch into an FPGA. Behind
> the switch are a number of PCI-to-PCI (P2P) bridges, each with a
> corresponding endpoint, as shown below.
>
>                        +--------------+
>                        | Root Complex |
>   Host                 +--------------+
>   ============================|============================
>   FPGA                   +-----------+
>                          | PCIe port |
>                          +-----------+
>                               |
>                +--------------+--------------+
>                |              |              |
>             +-----+        +-----+        +-----+
>             | P2P |        | P2P |        | P2P |
>             +-----+        +-----+        +-----+
>                |              |              |
>           +----------+   +----------+   +----------+
>           | Endpoint |   | Endpoint |   | Endpoint |
>           +----------+   +----------+   +----------+
>
> This setup works very well, except for bulk transfers to or from
> individual endpoints, because the FPGA cores often do not support any
> kind of bus mastering. The FPGA cores, at least those we use, do not
> even natively support PCI. These cores are interconnected using the
> WISHBONE interface[1]. We connect the PCI port to the individual
> WISHBONE cores using a special PCI-to-WISHBONE bridge, translating
> PCI accesses to WISHBONE cycles.
>
> In order to fix the problem for bulk transfers we've been thinking
> about implementing a sort of generic PCI DMA mastering framework.
> This framework consists of two parts: one or more DMA masters within
> the PCI hierarchy that can access PCI endpoints as well as system
> RAM, and some kernel driver infrastructure to control these DMA
> masters.
>
> For FPGA cores that do not support DMA transfers natively, their
> driver can now use this framework to initiate bulk transfers to or
> from system RAM, or even to or from another core. The individual
> cores no longer need any mastering capabilities.
>
> In practice, setting up such transfers would look something like
> this: an endpoint driver queries the PCI DMA framework, passing to it
> the source (and/or target?) memory region of future DMA transfers.
> The framework will then look up a matching DMA master and pass a
> handle for it back to the driver, which can then use that handle to
> queue new transfers. Drivers for DMA controllers register the masters
> with the framework to make the functionality available to devices
> mapped within a specific memory region.
>
> In our case the logical place for the DMA master would be within the
> P2P bridges, because they intrinsically know about the memory window
> behind them already.
>
> To avoid duplication, perhaps this could somehow be integrated with
> the existing dmaengine API, though I am not sure how to arrange for
> the additional restrictions for specific memory windows.

I haven't looked at the dmaengine API recently, but it does seem like
you could extend the DMA mapping API to take a target device for P2P
transactions. Those APIs generally have system memory as an implicit
target or source for a given transaction, and the handle reflects
that. To support P2P you'd need to add a few more calls with
source/target device info like you suggest (they could just fall back
to system memory if those args were NULL to make implementation
easier).

-- 
Jesse Barnes, Intel Open Source Technology Center
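
To make the suggested DMA mapping extension concrete, here is a minimal
sketch assuming a hypothetical dma_map_single_peer() helper. Neither the
name nor the interface exists in the kernel; it only illustrates the idea
of naming a P2P target device, with a NULL peer falling back to the
existing system-memory semantics of dma_map_single(). A real version
would also need unmap/sync counterparts.

#include <linux/dma-mapping.h>

/*
 * Hypothetical: "peer" names the P2P target device instead of the
 * implicit system memory target/source of the existing calls.
 */
static inline dma_addr_t dma_map_single_peer(struct device *dev,
					     struct device *peer,
					     void *ptr, size_t size,
					     enum dma_data_direction dir)
{
	if (!peer)
		/* NULL peer: behave exactly like dma_map_single() today */
		return dma_map_single(dev, ptr, size, dir);

	/*
	 * A real implementation would verify that @dev can actually
	 * reach @peer (shared switch/root port, no intervening
	 * restrictions) and return a bus address inside @peer's memory
	 * window; returning 0 here just marks the P2P case as
	 * unimplemented in this sketch.
	 */
	return 0;
}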
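
For the register/look-up/queue flow proposed in the quoted mail, the
framework side might look roughly like the following sketch. All names
(pci_dma_master, pci_dma_register_master(), pci_dma_request_master())
are made up for illustration, and locking and error handling are
omitted.

#include <linux/list.h>
#include <linux/types.h>

struct pci_dma_master;

struct pci_dma_master_ops {
	/*
	 * Queue a transfer between two bus addresses: typically one side
	 * is a DMA-mapped system RAM buffer and the other an endpoint
	 * BAR, but endpoint-to-endpoint would work the same way.
	 */
	int (*queue)(struct pci_dma_master *master, dma_addr_t src,
		     dma_addr_t dst, size_t len,
		     void (*done)(void *data), void *data);
};

struct pci_dma_master {
	struct list_head node;
	resource_size_t window_start;	/* bus window this master can reach */
	resource_size_t window_end;
	const struct pci_dma_master_ops *ops;
};

static LIST_HEAD(pci_dma_masters);

/* Called by a bridge driver to advertise its DMA engine and its window. */
void pci_dma_register_master(struct pci_dma_master *master)
{
	list_add_tail(&master->node, &pci_dma_masters);
}

/* Called by an endpoint driver with the bus range of its BAR(s). */
struct pci_dma_master *pci_dma_request_master(resource_size_t start,
					      resource_size_t end)
{
	struct pci_dma_master *master;

	list_for_each_entry(master, &pci_dma_masters, node)
		if (start >= master->window_start &&
		    end <= master->window_end)
			return master;

	return NULL;
}

In this picture each P2P bridge driver would register one
pci_dma_master covering the window it forwards, and an endpoint driver
would request a master for its BAR range and use the returned queue()
hook for bulk transfers.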