Re: [PATCH, v3] MIPS: lib: csum_partial: more instruction paral

On 05/20/2014 01:09 PM, chenj wrote:
> Computing the sum introduces a true data dependency. This patch removes
> some of these true data dependencies, so instruction-level parallelism
> is improved.
> 
> This patch brings up to a 50% csum performance gain on the Loongson 3a
> processor in our test.
> 
> One example about how this patch works is in CSUM_BIGCHUNK1:
> // ** original **    vs    ** patch applied **
>     ADDC(sum, t0)           ADDC(t0, t1)
>     ADDC(sum, t1)           ADDC(t2, t3)
>     ADDC(sum, t2)           ADDC(sum, t0)
>     ADDC(sum, t3)           ADDC(sum, t2)
> 
> In the original implementation, each ADDC(sum, ...) reads the sum
> value updated by the previous ADDC.
> 
> With the patch applied, the first two ADDC operations are independent,
> so they can execute simultaneously when the hardware allows it.
> 
> Another example is in the "copy and sum calculating" chunk:
> // ** original **    vs    ** patch applied **
>     STORE(t0, UNIT(0)...    STORE(t0, UNIT(0)...
>     ADDC(sum, t0)           ADDC(t0, t1)
>     STORE(t1, UNIT(1)...    STORE(t1, UNIT(1)...
>     ADDC(sum, t1)           ADDC(sum, t0)
>     STORE(t2, UNIT(2)...    STORE(t2, UNIT(2)...
>     ADDC(sum, t2)           ADDC(t2, t3)
>     STORE(t3, UNIT(3)...    STORE(t3, UNIT(3)...
>     ADDC(sum, t3)           ADDC(sum, t2)
> 
> With the patch applied, the second and third ADDC are independent.
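
The CSUM_BIGCHUNK1 reordering quoted above can be sketched in C as follows. This is only an illustration, not the kernel code: addc() is a hypothetical helper that assumes ADDC is an add with end-around carry (add, then add the carry-out back in), and csum_serial() / csum_paired() are made-up names for the "original" and "patch applied" columns.

```c
#include <stdint.h>

/* Assumed model of ADDC(sum, reg): add, then fold the carry-out back in. */
static inline uint64_t addc(uint64_t sum, uint64_t reg)
{
    sum += reg;
    return sum + (sum < reg);   /* (sum < reg) is 1 iff the add wrapped */
}

/* "original": every addc() waits for the sum from the previous one. */
static uint64_t csum_serial(uint64_t sum, const uint64_t t[4])
{
    sum = addc(sum, t[0]);
    sum = addc(sum, t[1]);
    sum = addc(sum, t[2]);
    sum = addc(sum, t[3]);
    return sum;
}

/* "patch applied": the first two addc() calls share no operands, so an
 * out-of-order or multi-issue core is free to overlap them. */
static uint64_t csum_paired(uint64_t sum, const uint64_t t[4])
{
    uint64_t t0 = addc(t[0], t[1]);   /* independent of the next line */
    uint64_t t2 = addc(t[2], t[3]);

    sum = addc(sum, t0);
    sum = addc(sum, t2);
    return sum;
}
```

In this model the serial version has a dependency chain of four addc() steps, while the paired version has a chain of three, with two of its four steps able to run side by side. Whether that shows up as a speedup depends on the core.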

Hi chenj,

You forgot to sign off your patch.

-- 
markos

