On 2024/12/16 8:35, Vinod Koul wrote: > Hi Niklas, > > On 13-12-24, 17:59, Niklas Cassel wrote: >> Hello Vinod, >> >> I am a bit confused about the usage of the dmaengine API, and I hope that you >> could help make me slightly less confused :) > > Sure thing! > >> If you look at the nvmet_pciep_epf_dma_transfer() function below, it takes a >> mutex around the dmaengine_slave_config(), dmaengine_prep_slave_single(), >> dmaengine_submit(), dma_sync_wait(), and dmaengine_terminate_sync() calls. >> >> I really wish that we would remove this mutex, to get better performance. >> >> >> If I look at e.g. the drivers/dma/dw-edma/dw-edma-core.c driver, I can see >> that dmaengine_prep_slave_single() (which will call >> device_prep_slave_sg(.., .., 1, .., .., ..)) allocates a new >> dma_async_tx_descriptor for each function call. >> >> I can see that device_prep_slave_sg() (dw_edma_device_prep_slave_sg()) will >> call dw_edma_device_transfer() which will call vchan_tx_prep(), which adds >> the descriptor to the tail of a list. >> >> I can also see that dw_edma_done_interrupt() will automatically start the >> transfer of the next descriptor (using vchan_next_desc()). >> >> So this looks like it is supposed to be asynchronous... however, if we simply >> remove the mutex, we get IOMMU errors, most likely because the DMA writes to >> an incorrect address. >> >> It looks like this is because dmaengine_prep_slave_single() really requires >> dmaengine_slave_config() for each transfer. (Since we are supplying a src_addr >> in the sconf that we are supplying to dmaengine_slave_config().) >> >> (i.e. we can't call dmaengine_slave_config() while a DMA transfer is active.) >> >> So while this API is supposed to be async, to me it looks like it can only >> be used in a synchronous manner... But that seems like a really weird design. >> >> Am I missing something obvious here? > > Yes, I feel nvme being treated as slave transfer, which it might not be. > This API was designed for peripherals like i2c/spi etc where we have a > hardware address to read/write to. So the dma_slave_config would pass on > the transfer details for the peripheral like address, width of fifo, > depth etc and these are setup config, so call once for a channel and then > prepare the descriptor, submit... and repeat of prepare and submit ... > > I suspect since you are passing an address which keep changing in the > dma_slave_config, you need to guard that and prep_slave_single() call, > as while preparing the descriptor driver would lookup what was setup for > the configuration. > > I suggest then use the prep_memcpy() API instead and pass on source and > destination, no need to lock the calls... Vinod, Thank you for the information. However, I think we can use this only if the DMA controller driver implements the device_prep_dma_memcpy operation, no ? In our case, the DWC EDMA driver does not seem to implement this. -- Damien Le Moal Western Digital Research