Hi Jason,

On Tue, Jan 11, 2022 at 2:49 PM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> Re-wind the loops entirely on kernels optimized for code size. This is
> really not good at all performance-wise. But on m68k, it shaves off 4k
> of code size, which is apparently important.

On arm32:

add/remove: 1/0 grow/shrink: 0/1 up/down: 160/-4212 (-4052)
Function                                     old     new   delta
blake2s_sigma                                  -     160    +160
blake2s_compress_generic                    4872     660   -4212
Total: Before=9846148, After=9842096, chg -0.04%

On arm64:

add/remove: 1/2 grow/shrink: 0/1 up/down: 160/-4584 (-4424)
Function                                     old     new   delta
blake2s_sigma                                  -     160    +160
e843419@0710_00007634_e8a0                     8       -      -8
e843419@0441_0000423a_178c                     8       -      -8
blake2s_compress_generic                    5088     520   -4568
Total: Before=32800278, After=32795854, chg -0.01%

> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>

For the size reduction:
Tested-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds