Re: [PATCH v2 05/20] crypto: mips/chacha - import accelerated 32r2 code from Zinc

Quoting Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>:

<snip>

Hi Ard,

Thanks a lot for taking the time to double check this. I think it
would be nice to be able to expose xchacha12 like we do on other
architectures.

Note that for xchacha, I also added a hchacha_block() routine based on
your code (with the round count as the third argument) [0]. Please let
me know if you see anything wrong with that.


+.globl hchacha_block
+.ent hchacha_block
+hchacha_block:
+ .frame $sp, STACK_SIZE, $ra
+
+ addiu $sp, -STACK_SIZE
+
+ /* Save s0-s7 */
+ sw $s0, 0($sp)
+ sw $s1, 4($sp)
+ sw $s2, 8($sp)
+ sw $s3, 12($sp)
+ sw $s4, 16($sp)
+ sw $s5, 20($sp)
+ sw $s6, 24($sp)
+ sw $s7, 28($sp)

We only have to preserve the s registers that are actually used.
Currently X11 to X15 use the registers s6 down to s2.

But by shuffling/redefining the needed registers so that we use all the
registers that do not have to be preserved, I can reduce the number of
used s registers to one.

The registers we don't use and don't have to preserve are a3, at and v0.
STATE (a0) can also be reused, because we only need that pointer while
loading the values from memory.

So:

#undef X12
#undef X13
#undef X14
#undef X15

#define X12    $a3
#define X13    $at
#define X14    $v0
#define X15    STATE

And save only X11 (s6) on the stack.

See the full code here [0].

Apart from that, the code looks good!

Greets,

René

[0]: https://github.com/vDorst/wireguard/commit/562a516ae3b282b32f57d3239369360bc926df60


+
+ lw X0, 0(STATE)
+ lw X1, 4(STATE)
+ lw X2, 8(STATE)
+ lw X3, 12(STATE)
+ lw X4, 16(STATE)
+ lw X5, 20(STATE)
+ lw X6, 24(STATE)
+ lw X7, 28(STATE)
+ lw X8, 32(STATE)
+ lw X9, 36(STATE)
+ lw X10, 40(STATE)
+ lw X11, 44(STATE)
+ lw X12, 48(STATE)
+ lw X13, 52(STATE)
+ lw X14, 56(STATE)
+ lw X15, 60(STATE)
+
+.Loop_hchacha_xor_rounds:
+ addiu $a2, -2
+ AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 16);
+ AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 12);
+ AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 8);
+ AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 7);
+ AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 16);
+ AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 12);
+ AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 8);
+ AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 7);
+ bnez $a2, .Loop_hchacha_xor_rounds
+
+ sw X0, 0(OUT)
+ sw X1, 4(OUT)
+ sw X2, 8(OUT)
+ sw X3, 12(OUT)
+ sw X12, 16(OUT)
+ sw X13, 20(OUT)
+ sw X14, 24(OUT)
+ sw X15, 28(OUT)
+
+ /* Restore used registers */
+ lw $s0, 0($sp)
+ lw $s1, 4($sp)
+ lw $s2, 8($sp)
+ lw $s3, 12($sp)
+ lw $s4, 16($sp)
+ lw $s5, 20($sp)
+ lw $s6, 24($sp)
+ lw $s7, 28($sp)
+
+ addiu $sp, STACK_SIZE
+ jr $ra
+.end hchacha_block
+.set at
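
For anyone who wants to check the assembly against something portable,
here is a minimal C sketch of what the routine above computes. It is not
the kernel's generic implementation: hchacha_block_ref, quarter_round and
rotl32 are names made up for this illustration. It assumes the same
contract as the assembly: a 16-word input state, the round count as the
third argument (even and positive), and words 0-3 and 12-15 of the
permuted state written to the output.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on words a, b, c, d of the state. */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/*
 * Hypothetical reference for hchacha_block(state, out, nrounds).
 * Assumption: nrounds is even and positive, as the assembly loop
 * (addiu $a2, -2 ... bnez $a2) expects.
 */
static void hchacha_block_ref(const uint32_t state[16], uint32_t out[8],
			      int nrounds)
{
	uint32_t x[16];
	int i;

	/* Work on a copy, like the lw X0..X15 sequence. */
	for (i = 0; i < 16; i++)
		x[i] = state[i];

	/* Each iteration is one double round: columns, then diagonals. */
	for (i = 0; i < nrounds; i += 2) {
		quarter_round(x, 0, 4,  8, 12);
		quarter_round(x, 1, 5,  9, 13);
		quarter_round(x, 2, 6, 10, 14);
		quarter_round(x, 3, 7, 11, 15);

		quarter_round(x, 0, 5, 10, 15);
		quarter_round(x, 1, 6, 11, 12);
		quarter_round(x, 2, 7,  8, 13);
		quarter_round(x, 3, 4,  9, 14);
	}

	/* No feed-forward addition: only X0-X3 and X12-X15 are stored. */
	for (i = 0; i < 4; i++) {
		out[i] = x[i];
		out[4 + i] = x[12 + i];
	}
}
```

Called with nrounds = 20 this corresponds to HChaCha20, and with
nrounds = 12 to the variant xchacha12 needs. The two halves of the loop
body line up with the first four and the last four AXR() lines in the
patch.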


[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=wireguard-crypto-library-api-v3&id=cc74a037f8152d52bd17feaf8d9142b61761484f







