On Thu, Jan 16, 2025 at 10:17:25AM +0530, Manivannan Sadhasivam wrote:
> On Thu, Jan 02, 2025 at 03:23:14PM +0100, Niklas Cassel wrote:
> > Hello Mani, Vinod,
> >
> > On Thu, Jan 02, 2025 at 12:34:04PM +0530, Manivannan Sadhasivam wrote:
> > > On Tue, Dec 31, 2024 at 08:33:57PM +0100, Niklas Cassel wrote:
> > > >
> > > > I have some patches that add DMA_MEMCPY to dw-edma, but I'm not sure
> > > > if the DWC eDMA hardware supports having both src and dst as PCI
> > > > addresses, or if only one of them can be a PCI address (with the
> > > > other one being a local address).
> > > >
> > > > If only one of them can be a PCI address, then I'm not sure if your
> > > > suggested patch is correct.
> > > >
> > >
> > > I don't see why that would be an issue. DMA_MEMCPY is independent of PCI/local
> > > addresses. If a dmaengine driver supports doing MEMCPY, then the dma cap should
> > > be sufficient. As you said, if a controller supports both SLAVE and MEMCPY, the
> > > test currently errors out, which is wrong.
> >
> > While I am okay with your suggested change to pci-epf-test.c:
> > > >- if (epf_test->dma_private) {
> > > >+ if (!dma_has_cap(DMA_MEMCPY, epf_test->dma_chan_tx->device->cap_mask)) {
> >
> > Since this will ensure that a DMA driver implementing DMA_MEMCPY,
> > which cannot be shared (has DMA_PRIVATE set), will not error out.
> >
> >
> > What I'm trying to explain is that in:
> > https://lore.kernel.org/linux-pci/Z2BW4CjdE1p50AhC@vaman/
> > https://lore.kernel.org/linux-pci/20241217090129.6dodrgi4tn7l3cod@thinkpad/
> >
> > Vinod (and you) suggested that we should add support for prep_memcpy()
> > (which implies also setting cap DMA_MEMCPY) in the dw-edma DMA driver.
> >
> > However, from section "6.3 Using the DMA" in the DWC databook,
> > the DWC eDMA hardware only supports:
> > - Transfer (copy) of a block of data from local memory to remote memory.
> > - Transfer (copy) of a block of data from remote memory to local memory.
> >
> >
> > Currently, we have:
> > https://github.com/torvalds/linux/blob/v6.13-rc5/include/linux/dmaengine.h#L843-L844
> > https://github.com/torvalds/linux/blob/v6.13-rc5/drivers/dma/dw-edma/dw-edma-core.c#L215-L231
> >
> > Where we can expose per-channel capabilities, so we set MEM_TO_DEV/DEV_TO_MEM
> > per channel, however, these are returned in a struct dma_slave_caps *caps,
> > so this is AFAICT only for DMA_SLAVE, not for DMA_MEMCPY.
> >
> > Looking at:
> > https://github.com/torvalds/linux/blob/v6.13-rc5/include/linux/dmaengine.h#L975-L979
> > it seems that DMA_MEMCPY is always assumed to be MEM_TO_MEM.
> >
> > To me, it seems that we would either need a new dma_transaction_type (e.g. DMA_COPY)
> > where we can set dir:
> > MEM_TO_DEV, DEV_TO_MEM, or DEV_TO_DEV. (dw-edma would not support DEV_TO_DEV.)
> >
> > Or, if we should stick with DMA_MEMCPY, we would need another way of telling
> > client drivers that only src or dst can be a remote address.
> >
> > Until this is solved, I think I will stop my work on adding DMA_MEMCPY to the
> > dw-edma driver.
> >
>
> I think your concern is regarding setting the DMA transfer direction for MEMCPY,
> right? And you are saying that even if we use tx/rx channels, currently we
> cannot set DEV_TO_DEV like directions?
>
> But I'm somewhat confused about what is blocking you from adding MEMCPY support
> to the dw-edma driver since that driver cannot support DEV_TO_DEV. In your WIP
> driver, you were setting the direction based on the channel. Isn't that
> sufficient enough?

What I did in the WIP driver patches was to set the direction to either
DEV_TO_MEM or MEM_TO_DEV. But that is wrong, since the prep_memcpy() API
doesn't take a direction. In fact, it appears that memcpy is always assumed
to be MEM_TO_MEM:
https://github.com/torvalds/linux/blob/v6.13-rc7/include/linux/dmaengine.h#L74

E.g. the dw-edma driver cannot have both src address and dst address as a local
address (MEM_TO_MEM), so using the DMA_MEMCPY API feels totally wrong.

Either dst or src has to be a local address (MEM), and the one that isn't a
local address has to be a PCI address (DEV). Sure, calling a PCI address
DEV might not be 100% correct, but I cannot think of a better way...

We also cannot treat a PCI address as MEM, as dw-edma cannot do PCI to PCI
transfers.

I think the best way forward would be to create a new _prep_slave_memcpy()
or similar, that does take a direction, and thus does not require
dmaengine_slave_config() to be called before every _prep_slave_memcpy() call,
since that is basically what is not allowing us to have multiple transactions
outstanding in parallel.


Kind regards,
Niklas
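
For illustration only: a rough userspace sketch (not kernel code) of the
constraint discussed above, where a prep call takes an explicit direction
and rejects anything other than MEM_TO_DEV or DEV_TO_MEM. Every name below
(sketch_dir, sketch_desc, sketch_prep_slave_memcpy) is hypothetical and is
not part of the dmaengine API; the direction names are only modelled on
enum dma_transfer_direction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical direction enum, modelled on the dmaengine direction names. */
enum sketch_dir {
	SKETCH_MEM_TO_MEM,
	SKETCH_MEM_TO_DEV,
	SKETCH_DEV_TO_MEM,
	SKETCH_DEV_TO_DEV,
};

/* Hypothetical descriptor: one local address and one PCI (remote) address. */
struct sketch_desc {
	uint64_t local_addr;
	uint64_t pci_addr;
	size_t len;
	enum sketch_dir dir;
};

/*
 * Assumption taken from the thread: the hardware copies local -> remote or
 * remote -> local only, so exactly one side of the transfer is a PCI address.
 */
static bool sketch_dir_supported(enum sketch_dir dir)
{
	return dir == SKETCH_MEM_TO_DEV || dir == SKETCH_DEV_TO_MEM;
}

/*
 * Hypothetical stand-in for a _prep_slave_memcpy()-style call. The direction
 * is passed per transfer, so no per-channel config step is needed between
 * calls. Returns 0 on success, -1 if the request cannot be expressed.
 */
static int sketch_prep_slave_memcpy(struct sketch_desc *desc, uint64_t dst,
				    uint64_t src, size_t len,
				    enum sketch_dir dir)
{
	assert(desc != NULL);

	if (!sketch_dir_supported(dir) || len == 0)
		return -1;

	desc->dir = dir;
	desc->len = len;
	if (dir == SKETCH_MEM_TO_DEV) {
		/* Local source, remote (PCI) destination. */
		desc->local_addr = src;
		desc->pci_addr = dst;
	} else {
		/* Remote (PCI) source, local destination. */
		desc->local_addr = dst;
		desc->pci_addr = src;
	}
	return 0;
}
```

With a shape like this, a client could queue a MEM_TO_DEV and a DEV_TO_MEM
transfer back to back, each carrying its own direction, while MEM_TO_MEM
and DEV_TO_DEV requests would be rejected up front.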