Re: [PATCH v4 17/18] nvmet: New NVMe PCI endpoint target driver

Vinod Koul <vkoul@xxxxxxxxxx> · Tue, 17 Dec 2024 10:57:02 +0530

On 16-12-24, 11:12, Damien Le Moal wrote:
> On 2024/12/16 8:35, Vinod Koul wrote:
> > Hi Niklas,
> > 
> > On 13-12-24, 17:59, Niklas Cassel wrote:
> >> Hello Vinod,
> >>
> >> I am a bit confused about the usage of the dmaengine API, and I hope that you
> >> could help make me slightly less confused :)
> > 
> > Sure thing!
> > 
> >> If you look at the nvmet_pciep_epf_dma_transfer() function below, it takes a
> >> mutex around the dmaengine_slave_config(), dmaengine_prep_slave_single(),
> >> dmaengine_submit(), dma_sync_wait(), and dmaengine_terminate_sync() calls.
> >>
> >> I really wish that we would remove this mutex, to get better performance.
> >>
> >>
> >> If I look at e.g. the drivers/dma/dw-edma/dw-edma-core.c driver, I can see
> >> that dmaengine_prep_slave_single() (which will call
> >> device_prep_slave_sg(.., .., 1, .., .., ..)) allocates a new
> >> dma_async_tx_descriptor for each function call.
> >>
> >> I can see that device_prep_slave_sg() (dw_edma_device_prep_slave_sg()) will
> >> call dw_edma_device_transfer() which will call vchan_tx_prep(), which adds
> >> the descriptor to the tail of a list.
> >>
> >> I can also see that dw_edma_done_interrupt() will automatically start the
> >> transfer of the next descriptor (using vchan_next_desc()).
> >>
> >> So this looks like it is supposed to be asynchronous... however, if we simply
> >> remove the mutex, we get IOMMU errors, most likely because the DMA writes to
> >> an incorrect address.
> >>
> >> It looks like this is because dmaengine_prep_slave_single() really requires
> >> dmaengine_slave_config() for each transfer. (Since we are supplying a src_addr
> >> in the sconf that we are supplying to dmaengine_slave_config().)
> >>
> >> (i.e. we can't call dmaengine_slave_config() while a DMA transfer is active.)
> >>
> >> So while this API is supposed to be async, to me it looks like it can only
> >> be used in a synchronous manner... But that seems like a really weird design.
> >>
> >> Am I missing something obvious here?
> > 
> > Yes, I feel nvme being treated as slave transfer, which it might not be.
> > This API was designed for peripherals like i2c/spi etc where we have a
> > hardware address to read/write to. So the dma_slave_config would pass on
> > the transfer details for the peripheral like address, width of fifo,
> > depth etc and these are setup config, so call once for a channel and then
> > prepare the descriptor, submit... and repeat of prepare and submit ...
> > 
> > I suspect since you are passing an address which keep changing in the
> > dma_slave_config, you need to guard that and prep_slave_single() call,
> > as while preparing the descriptor driver would lookup what was setup for
> > the configuration.
> > 
> > I suggest then use the prep_memcpy() API instead and pass on source and
> > destination, no need to lock the calls...
> 
> Vinod,
> 
> Thank you for the information. However, I think we can use this only if the DMA
> controller driver implements the device_prep_dma_memcpy operation, no ?
> In our case, the DWC EDMA driver does not seem to implement this.

It should be added in that case.

Before that, the bigger question is, should nvme be slave transfer or
memcpy.. Was driver support the reason why the slave transfer was used here...?

As i said, slave is for peripherals which have a static fifo to
send/receive data from, nvme sounds like a memory transfer to me, is
that a right assumption?

-- 
~Vinod