Re: [PATCH 0/8] ARM: mvebu: Add support for RAID6 PQ offloading

Maxime Ripard <maxime.ripard@xxxxxxxxxxxxxxxxxx> · Tue, 2 Jun 2015 16:41:26 +0200

On Tue, May 26, 2015 at 09:31:03AM -0700, Dan Williams wrote:
> > If you mean, "give me a hand, you can start there", then yeah, I can
> > do that.
> >
> >> I'm not happy about not having had the time to do this rework myself.
> >> Linux is better off with this api deprecated.
> >
> > You're not talking about deprecating it, you're talking about removing
> > it entirely.
> 
> True, and adding more users makes that removal more difficult.  I'm
> willing to help out on the design and review for this work, I just
> can't commit to doing the implementation and testing.
> 
> I think it looks something like this:
> 
> At init time the raid456 driver probes for offload resources.  It can
> discover several scenarios:
> 
> 1/ "the ioatdma case": raid channels that have all the necessary
> operations (copy, xor/pq, xor/pq check).  In this case we'll never
> need to perform a channel switch.  Potentially the cpu never touches
> the stripe cache in this case and we can maintain a static dma mapping
> for the entire lifespan of a struct stripe_head.
> 
> 2/ "the channel switch case":  All the necessary offload resources are
> available but span multiple devices.  In this case we need to wait for
> channel1 to complete an operation before channel2 can start.  This
> case is complicated by the fact that different channels may need their
> own dma mappings.  In the simplest case channels can share the same
> mapping and raid456 needs to wait for channel completions.  I think we
> can do a better job than the async_tx api here as raid456 should
> probably poll for completions after each stripe processing batch.
> Taking an interrupt per channel-switch event seems like excessive
> overhead.
> 
> 3/ "the co-op case": We have a xor/pq offload resource, but copy and
> check operations require the cpu to touch the stripe cache.  In this
> case we need to use the dma_sync_*_for_cpu()/dma_sync_*_for_device()
> to pass buffers back and forth between device and cpu ownership.  This
> shares some of the complexity of waiting for completions with scenario
> 2.
> 
> Which scenario does your implementation fall into?  Maybe we can focus
> on that one and leave the other scenarios for other dmaengine
> maintainers to jump in and implement?

From my limited understanding of RAID and PQ computations, it would be
scenario 3, with a twist.

Our hardware controller supports xor and PQ, but checks and data
recovery are not supported (we're not able to offload async_mult
and async_sum_product).

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com