Re: dmaengine discussions during Plumbers

Laurent Pinchart <laurent.pinchart@xxxxxxxxxxxxxxxx> · Thu, 03 Nov 2016 18:16:56 +0200

Hello,

Here are the notes I took during the session. As I was taking part in the
discussion I'm sure I have missed a few points, so please add any missing
information or correct any mistake I may have made.

- Allocation of TX async descriptors

It would be useful to decouple allocation from prepare, as prepare needs to be
callable from interrupt context, which forces descriptors to be allocated from
the interrupt pool. The room reached a consensus that this is a real problem
that should be addressed by a new API.

This is also related to runtime PM, which is currently painful with the
existing DMA engine API. (TBD: how?)
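To make the allocation/prepare split above more concrete, here is a minimal
userspace sketch. It is not the API that was discussed in the room: all names
(tx_desc, desc_pool, pool_alloc, pool_prepare, ...) are hypothetical, and
calloc() merely stands in for an allocation made in a context that may sleep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor, not the kernel's dma_async_tx_descriptor. */
struct tx_desc {
	struct tx_desc *next;	/* free-list link */
	size_t len;		/* filled in at prepare time */
	int prepared;
};

/* Hypothetical per-channel pool of pre-allocated descriptors. */
struct desc_pool {
	struct tx_desc *free;
	unsigned int nr_free;
};

/*
 * Step 1: allocate descriptors up front, in a context that may sleep.
 * Returns 0 on success, -1 on failure (caller then calls pool_free()).
 */
static int pool_alloc(struct desc_pool *pool, unsigned int count)
{
	pool->free = NULL;
	pool->nr_free = 0;
	while (count--) {
		struct tx_desc *d = calloc(1, sizeof(*d));

		if (!d)
			return -1;
		d->next = pool->free;
		pool->free = d;
		pool->nr_free++;
	}
	return 0;
}

/* Step 2: prepare without allocating anything (models interrupt context). */
static struct tx_desc *pool_prepare(struct desc_pool *pool, size_t len)
{
	struct tx_desc *d = pool->free;

	if (!d)
		return NULL;	/* pool exhausted, caller retries later */
	pool->free = d->next;
	pool->nr_free--;
	d->next = NULL;
	d->len = len;
	d->prepared = 1;
	return d;
}

/* Give a completed descriptor back to the pool for reuse. */
static void pool_release(struct desc_pool *pool, struct tx_desc *d)
{
	assert(d && d->prepared);
	d->prepared = 0;
	d->next = pool->free;
	pool->free = d;
	pool->nr_free++;
}

/* Free all unused descriptors, again in a context that may sleep. */
static void pool_free(struct desc_pool *pool)
{
	while (pool->free) {
		struct tx_desc *d = pool->free;

		pool->free = d->next;
		free(d);
	}
	pool->nr_free = 0;
}
```

With such a split, the prepare step only pops from a list, so it never has to
touch an allocator from interrupt context.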

- Error reporting

The use case is a UART implementation that doesn't provide an interrupt on
timeout. The DMA engine provides that interrupt, and we need to get the
residue at timeout time.

This is solved by

commit f067025bc676ba8d18fba5f959598339e39b86db
Author: Dave Jiang <dave.jiang@xxxxxxxxx>
Date:   Wed Jul 20 13:13:50 2016 -0700

    dmaengine: add support to provide error result from a DMA transation

The commit also supports DMA engine based network or USB devices, which
don't know in advance how much data will be received. In that case the slave
driver has to prepare a large-enough transaction, and the completion handler
reports the actual packet size through the residue (assuming the hardware
supports early termination of transactions when packet reception completes).
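As an illustration of how a completion handler could derive the received
packet size from the residue, here is a small userspace sketch. The mock
types are assumptions that only loosely mirror the shape of the result passed
to the completion callback (a result code plus a residue); they are not the
kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum mock_tx_result {
	MOCK_TRANS_NOERROR,
	MOCK_TRANS_ABORTED,
};

/* Mock of a per-transaction result: status plus bytes NOT transferred. */
struct mock_dma_result {
	enum mock_tx_result result;
	size_t residue;
};

/*
 * The driver prepared a buffer of buf_len bytes, large enough for any
 * packet. The number of bytes actually received is whatever the engine
 * did not leave behind as residue.
 */
static size_t rx_bytes_received(size_t buf_len,
				const struct mock_dma_result *res)
{
	assert(res->residue <= buf_len);
	return buf_len - res->residue;
}
```

For a 2048-byte receive buffer and a reported residue of 1536 bytes, the
handler would conclude that 512 bytes were received.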

- Per-channel capabilities

Most DMA engines have identical interchangeable channels. The API has been
designed around that assumption. However, some DMA engines have different
capabilities per channel. This isn't exposed by the current API.

One possible solution would be to register multiple DMA engines, each with a
set of identical channels.

The capabilities API (dma_get_slave_caps) is channel-based, so it could also 
be refactored in the core to expose per-channel capabilities.
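To sketch what per-channel capabilities could mean for channel selection, the
following userspace example picks the first idle channel whose capability
mask covers what the client needs. The capability bits, struct mock_chan and
request_chan() are all hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits, for this sketch only. */
enum {
	CAP_CYCLIC     = 1u << 0,
	CAP_SG         = 1u << 1,
	CAP_DEV_TO_MEM = 1u << 2,
};

/* Hypothetical channel with its own capability mask. */
struct mock_chan {
	unsigned int id;
	unsigned int caps;
	int busy;
};

/*
 * Return the first idle channel providing every capability in
 * 'required', and mark it busy. Returns NULL if no channel fits.
 */
static struct mock_chan *request_chan(struct mock_chan *chans, size_t n,
				      unsigned int required)
{
	size_t i;

	assert(chans || n == 0);
	for (i = 0; i < n; i++) {
		if (chans[i].busy)
			continue;
		if ((chans[i].caps & required) == required) {
			chans[i].busy = 1;
			return &chans[i];
		}
	}
	return NULL;
}
```

Registering multiple DMA engines, each with identical channels, would
correspond to grouping the entries of such a table by capability mask.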

- API refactoring

The transaction prepare API is possibly too high-level in the sense that it
combines multiple concepts (e.g. cyclic + physically contiguous, non-cyclic
+ sglist based, ...). Many DMA engines could support other combinations. We
could use a lower-level API that doesn't combine those concepts, thus enabling
more use cases.

We would need to handle more in core code. We currently leave the whole
implementation to drivers, which leads to slightly different behaviours
between drivers. As recent DMA engines tend to use small segments and chain
them, the framework could handle splitting of transactions into small
segments, and ask drivers to allocate those segments and chain them. When
CPU-based chaining is needed the framework could implement helper functions
for that purpose. This would lead to a more coherent behaviour.
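A core-side splitting helper along those lines could look roughly like the
sketch below. This is plain userspace C under my own assumptions: struct
segment, split_transfer() and the max_seg limit are hypothetical names, and a
physically contiguous source buffer is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware segment: bus address plus length in bytes. */
struct segment {
	uint64_t addr;
	size_t len;
};

/*
 * Split a contiguous transfer of 'len' bytes at 'addr' into segments of
 * at most 'max_seg' bytes. Returns the number of segments written to
 * 'out', or 0 if 'len' is 0 or 'out' (capacity 'out_cap') is too small.
 */
static size_t split_transfer(uint64_t addr, size_t len, size_t max_seg,
			     struct segment *out, size_t out_cap)
{
	size_t n = 0;

	assert(max_seg > 0);
	while (len > 0) {
		size_t chunk = len < max_seg ? len : max_seg;

		if (n == out_cap)
			return 0;
		out[n].addr = addr;
		out[n].len = chunk;
		addr += chunk;
		len -= chunk;
		n++;
	}
	return n;
}
```

The driver would then only have to turn each struct segment into its own
hardware descriptor and chain them.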

With an API where drivers only handle small segments, residue calculation 
could be handled mostly by the core, with a single driver operation to 
retrieve the DMA pointer value for the current segment.
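The residue calculation described above can be sketched as follows, assuming
the core keeps the list of segments and the driver operation returns the
index of the current segment together with the hardware DMA pointer. All
names (struct segment, core_residue) are hypothetical, and this is userspace
C rather than kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware segment: bus address plus length in bytes. */
struct segment {
	uint64_t addr;
	size_t len;
};

/*
 * Residue = bytes left in the current segment (from the DMA pointer to
 * the end of that segment) + full length of all segments not yet started.
 * 'cur' and 'hw_ptr' are what the single driver operation would report.
 */
static size_t core_residue(const struct segment *segs, size_t nsegs,
			   size_t cur, uint64_t hw_ptr)
{
	size_t residue, i;

	assert(cur < nsegs);
	assert(hw_ptr >= segs[cur].addr &&
	       hw_ptr <= segs[cur].addr + segs[cur].len);

	residue = (size_t)(segs[cur].addr + segs[cur].len - hw_ptr);
	for (i = cur + 1; i < nsegs; i++)
		residue += segs[i].len;
	return residue;
}
```

For segments of 256, 256 and 128 bytes, with the engine 64 bytes into the
second segment, the residue is 192 + 128 = 320 bytes.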

-- 
Regards,

Laurent Pinchart
