Re: [PATCH 0/7] mv_xor: Cleanups and improvements

Ezequiel Garcia <ezequiel.garcia@xxxxxxxxxxxxxxxxxx> · Tue, 16 Sep 2014 09:10:26 +0100

Hi Emilio,

On 12 Sep 08:52 PM, Emilio López wrote:
> El 27/08/14 a las 10:52, Ezequiel Garcia escibió:
> >Hi everyone,
> >
> >This series is the result of Lior Amsalem's nice work to improve the mv_xor
> >engine. The most interesting patches are:
> >
> >   * Patch 2, which reduces code duplication by implementing the DMA_MEMCPY
> >     in terms of a DMA_XOR.
> >
> >     This not only simplifies the driver, but also improves performance.
> >     The controller is now able to put DMA_MEMCPY and DMA_XOR transactions
> >     in the same chain, given they are both handled as XOR commands.
> 
> What happens if that is not the case? In other words, if you're only issuing
> memcpy transactions, is the throughput better, the same, or worse when
> compared to the old code? An easy test for this could be dmatest doing
> memcpy (don't forget to turn verify off)
> 

No, I haven't done much performance measurements with memcpy alone.

> In relation to that, what's the Real World(tm) use case for these
> operations? Is it common to have them interleaved? It may be worth
> considering this and documenting it on the commit messages when doing these
> sorts of optimizations.
> 

The sole user of the XOR DMAengine is the async_tx API, which in turn is only
used by the RAID stack. This is the Real World use case, and it's the use case
these patches optimize.

Maybe it would be worth doing some more detailed performance measurements,
and add them to the commit logs.

On the other side, the code cleanup and the diffstat speak for itself,
and I can't see much chances of a performance regression.

> >   * Patch 3, which removes the multi-slot support entirely. Such support
> >     is not used and it only makes the code much more complex for no reason.
> >
> >   * Patch 5, which enables the end-of-descriptor interrupt only when needed.
> >     For DMA_XOR and DMA_MEMCPY operations the end-of-chain is used. This
> >     reduces the number of interrupts.
> >
> >   * Patch 7, which adds support for the DMA_INTERRUPT operation. This is
> >     implemented as a XOR operation from a dummy source to a dummy destination.
> 
> I see you used char arrays on your struct for this; don't you need some
> special alignment for the engine?

Nope, no special alignment required.

> Have you tried with a 0-sized operation?

Hm, what do you mean by that? The DMA_INTERRUPT is just a minimum-sized, single
source XOR from a dummy source to a dummy destination. The minimum size is
a hardware constraint, so there's no way it can be 0-sized.

Thanks a lot for the review!
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html