Re: Re[2]: [PATCH 02/11][v3] async_tx: add support for asynchronous GF multiplication

On Fri, Jan 16, 2009 at 4:41 AM, Yuri Tikhonov <yur@xxxxxxxxxxx> wrote:
>> I don't think this will work as we will be mixing Q into the new P and
>> P into the new Q.  In order to support (src_cnt > device->max_pq) we
>> need to explicitly tell the driver that the operation is being
>> continued (DMA_PREP_CONTINUE) and to apply different coefficients to
>> P and Q to cancel the effect of including them as sources.
>
>  With DMA_PREP_ZERO_P/Q approach, the Q isn't mixed into new P, and P
> isn't mixed into new Q. For your example of max_pq=4:
>
>  p, q = PQ(src0, src1, src2, src3, src4, COEF({01}, {02}, {04}, {08}, {10}))
>
>  with the current implementation will be split into:
>
>  p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
>  p`,q` = PQ(src4, COEF({10}))
>
>  which will result in the following:
>
>  p = ((dma_flags & DMA_PREP_ZERO_P) ? 0 : old_p) + src0 + src1 + src2 + src3
>  q = ((dma_flags & DMA_PREP_ZERO_Q) ? 0 : old_q) + {01}*src0 + {02}*src1 + {04}*src2 + {08}*src3
>
>  p` = p + src4
>  q` = q + {10}*src4
>

Huh?  Does the ppc440spe engine have some notion of flagging a source
as old_p/old_q?  Otherwise I do not see how the engine will not turn
this into:

p` = p + src4 + q
q` = q + {10}*src4 + {x}*p

I think you missed the fact that we have passed p and q back in as
sources.  Unless we have multiple p destinations and multiple q
destinations, or hardware support for continuations I do not see how
you can guarantee this split.
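
To make the concern concrete, here is a minimal Python sketch; it is not the async_tx code. It assumes the usual RAID-6 field GF(2^8) with polynomial 0x11d, one-byte "blocks", and a toy pq() model of an engine. The names gf_mul and pq, the sample data, and the choice x = {01} are all made up for illustration. It compares three ways of continuing the max_pq=4 example: an engine that really accumulates into old_p/old_q, an engine that only sees p and q as two more sources, and one possible coefficient choice that would cancel the mixing, assuming the engine accepts a {00} coefficient and has spare source slots.

```python
def gf_mul(a, b, poly=0x11d):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def pq(srcs, coefs, old_p=0, old_q=0):
    """Toy engine: P is the XOR of all sources, Q the GF-weighted XOR.

    old_p/old_q model an engine that accumulates into its destinations
    (the non-ZERO_P/Q case). This is an assumption about the hardware.
    """
    p, q = old_p, old_q
    for s, c in zip(srcs, coefs):
        p ^= s
        q ^= gf_mul(c, s)
    return p, q


srcs = [0x11, 0x22, 0x33, 0x44, 0x55]    # arbitrary sample bytes
coefs = [0x01, 0x02, 0x04, 0x08, 0x10]   # {01}, {02}, {04}, {08}, {10}

# Reference: all five sources in one operation.
want_p, want_q = pq(srcs, coefs)

# First half of the split (max_pq = 4).
p, q = pq(srcs[:4], coefs[:4])

# (a) Engine accumulates into the destinations: p` = p + src4, q` = q + {10}*src4.
acc_p, acc_q = pq(srcs[4:], coefs[4:], old_p=p, old_q=q)

# (b) Engine only sees p and q as ordinary sources, with some coefficient x on p:
#     p` = p + q + src4, q` = {x}*p + q + {10}*src4.
x = 0x01  # hypothetical coefficient
mixed_p, mixed_q = pq([p, q, srcs[4]], [x, 0x01, coefs[4]])

# (c) Cancel the mixing with coefficients alone: {00}*p keeps p out of q`,
#     and passing q a second time with {00} XORs it back out of p`.
fix_p, fix_q = pq([p, q, q, srcs[4]], [0x00, 0x01, 0x00, coefs[4]])

print("one pass        :", hex(want_p), hex(want_q))
print("(a) accumulate  :", hex(acc_p), hex(acc_q))
print("(b) p,q as srcs :", hex(mixed_p), hex(mixed_q))
print("(c) cancelling  :", hex(fix_p), hex(fix_q))
```

In this model (a) and (c) reproduce the single-pass result, while (b) does not; (c) costs three source slots for the continuation instead of none.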

>  I'm afraid that the difference (13/4, 125/32) is very significant, so
> getting rid of DMA_PREP_ZERO_P/Q will eat most of the improvement
> which could be achieved with the current approach.

Data corruption is a slightly higher cost :-).

>
>>  but at this point I do not see a cleaner alternative for engines like iop13xx.
>
>  I can't find any description of iop13xx processors at Intel's
> web-site, only 3xx:
>
> http://www.intel.com/design/iio/index.htm?iid=ipp_embed+embed_io
>
>  So, it's hard for me to make any suggestions. I just wonder - doesn't
> iop13xx allow users to program destination addresses into the sources
> fields of descriptors?

Yes it does, but the engine does not know it is a destination.

Take a look at page 496 of the following and tell me if you come to a
different conclusion.
http://download.intel.com/design/iio/docs/31503602.pdf

Thanks,
Dan