Hi Frank,

On 29/09/22 3:08 am, Frank Li wrote:
ALL: Recently some important PCI EP function patches were merged, especially DWC eDMA support. The PCIe eDMA has a nice feature: it can read/write all of the PCI host's memory regardless of the EP side's PCI memory map window size. pci-epf-vntb.c has also been merged into mainline, and part of the vntb MSI patch has already been merged:
https://lore.kernel.org/imx/86mtaj7hdw.wl-maz@xxxxxxxxxx/T/#m35546867af07735c1070f596d653a2666f453c52
Although MSI can improve transfer latency, the transfer speed is still quite slow because DMA is not supported yet. I plan to continue improving the transfer speed, but I found some fundamental limitations in the original framework which prevent getting 100% of the eDMA's benefit.
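(For illustration only, a rough sketch of what driving the eDMA write channel through the dmaengine slave API could look like, similar to the pattern pci-epf-test uses; chan, dma_local, host_pci_addr, len and the completion callback here are assumed to be set up by the caller:)

#include <linux/dmaengine.h>

static int edma_write_to_host(struct dma_chan *chan, dma_addr_t dma_local,
			      dma_addr_t host_pci_addr, size_t len,
			      dma_async_tx_callback done, void *param)
{
	struct dma_slave_config sconf = {
		.direction = DMA_MEM_TO_DEV,
		/* The eDMA takes the remote bus address directly, so no
		 * outbound window of matching size is needed. */
		.dst_addr = host_pci_addr,
	};
	struct dma_async_tx_descriptor *tx;
	dma_cookie_t cookie;
	int ret;

	ret = dmaengine_slave_config(chan, &sconf);
	if (ret)
		return ret;

	tx = dmaengine_prep_slave_single(chan, dma_local, len, DMA_MEM_TO_DEV,
					 DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
	if (!tx)
		return -EIO;

	tx->callback = done;
	tx->callback_param = param;

	cookie = dmaengine_submit(tx);
	ret = dma_submit_error(cookie);
	if (ret)
		return ret;

	dma_async_issue_pending(chan);
	return 0;
}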
By framework, you mean limitations with pci-epf-vntb, right?
After researching some old threads:
https://lore.kernel.org/linux-pci/20200702082143.25259-1-kishon@xxxxxx/
https://lore.kernel.org/linux-pci/9f8e596f-b601-7f97-a98a-111763f966d1@xxxxxx/T/
some RDMA documents, and https://github.com/ntrdma/ntrdma-ext, I think the solution based on Haotian Wang's work will be the best one.
Why?
[Diagram: two boxes side by side. Left box: the PCI EP Controller with eDMA, containing a virtio_net device with TX/RX VirtQueues, the Virtual PCI BUS and the eDMA queue. Right box: the PCI Host, containing a virtio_net device with TX/RX VirtQueues. The EP's TX queue is paired with the host's RX queue, and the host's TX queue with the EP's RX queue, through the eDMA queue.]

Basic idea is:
1. Both EP and host probe the virtio_net driver.
2. There are two queues: one on the EP side (EQ), the other on the host side.
3. The EP side EPF driver maps the host side's queue into the EP's address space; call it HQ.
4. One working thread (a rough sketch of this step follows below):
   a. Picks one TX from the EQ and one RX from the HQ, combines them into an eDMA request, and puts it into the DMA TX queue.
   b. Picks one RX from the EQ and one TX from the HQ, combines them into an eDMA request, and puts it into the DMA RX queue.
5. The eDMA completion interrupt marks the related items in the EQ and HQ as finished.

The whole transfer is zero-copy and uses the DMA queue. RDMA has a similar idea but needs more coding effort.
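(A very rough sketch of the step 4 pairing logic; everything below -- struct epf_vnet, struct epf_vq, epf_vq_pop() and edma_submit() -- is a hypothetical placeholder for the real vring parsing and dmaengine plumbing, not existing code:)

#include <linux/dmaengine.h>
#include <linux/types.h>

struct epf_vq;				/* hypothetical vring wrapper */

struct epf_vnet {
	struct epf_vq *eq_tx, *eq_rx;	/* EP-local virtqueues (EQ) */
	struct epf_vq *hq_tx, *hq_rx;	/* host queues mapped into EP space (HQ) */
	struct dma_chan *dma_wr_chan;	/* eDMA write channel */
	struct dma_chan *dma_rd_chan;	/* eDMA read channel */
};

struct vq_desc {
	dma_addr_t addr;	/* local address for EQ, host bus address for HQ */
	u32 len;
};

/* Pop the next available buffer; returns false when the queue is empty. */
bool epf_vq_pop(struct epf_vq *vq, struct vq_desc *desc);

/* Queue one eDMA copy of 'len' bytes from 'src' to 'dst' on 'chan'; the
 * completion callback marks the corresponding EQ/HQ entries used (step 5). */
int edma_submit(struct dma_chan *chan, dma_addr_t dst, dma_addr_t src, u32 len);

static void epf_vnet_xfer_once(struct epf_vnet *vnet)
{
	struct vq_desc tx, rx;

	/* 4a: EP -> host, pair an EQ TX buffer with an HQ RX buffer and
	 * queue an eDMA write (local memory -> host memory). */
	if (epf_vq_pop(vnet->eq_tx, &tx) && epf_vq_pop(vnet->hq_rx, &rx))
		edma_submit(vnet->dma_wr_chan, rx.addr, tx.addr,
			    min(tx.len, rx.len));

	/* 4b: host -> EP, pair an HQ TX buffer with an EQ RX buffer and
	 * queue an eDMA read (host memory -> local memory). */
	if (epf_vq_pop(vnet->hq_tx, &tx) && epf_vq_pop(vnet->eq_rx, &rx))
		edma_submit(vnet->dma_rd_chan, rx.addr, tx.addr,
			    min(tx.len, rx.len));
}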
My suggestion would be to pick a cleaner solution with the right abstractions, not one based on coding effort.
I think Kishon Vijay Abraham I prefers to use vhost, but I don't know how to build a queue on the host side.
Not sure what you mean by host side here. But the queue would only be on the virtio frontend (virtio-net running on the PCIe RC), and the PCIe EP would access the frontend's queue.
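(If it helps, a rough sketch of how the EP could reach the frontend's queue with the standard EPC API; this assumes the frontend's vring bus address and size, vq_pci_addr and vq_size, were advertised to the EP out of band, e.g. through a shared BAR register:)

#include <linux/pci-epc.h>
#include <linux/pci-epf.h>

static void __iomem *epf_map_host_vring(struct pci_epf *epf,
					phys_addr_t *phys,
					u64 vq_pci_addr, size_t vq_size)
{
	struct pci_epc *epc = epf->epc;
	void __iomem *virt;
	int ret;

	/* Reserve a window in the EP's outbound address space */
	virt = pci_epc_mem_alloc_addr(epc, phys, vq_size);
	if (!virt)
		return NULL;

	/* Program the outbound ATU so EP accesses hit the host's vring */
	ret = pci_epc_map_addr(epc, epf->func_no, epf->vfunc_no,
			       *phys, vq_pci_addr, vq_size);
	if (ret) {
		pci_epc_mem_free_addr(epc, *phys, virt, vq_size);
		return NULL;
	}

	return virt;
}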
The NTB transfer just does a one-direction eDMA transfer (DMA write), because a read is actually local memory to local memory. Any comments about the overall solution?
I would suggest you go through the comments received on Haotian Wang's patch and describe what changes you are proposing.
Thanks,
Kishon