Quoting Johannes Goetzfried
<Johannes.Goetzfried@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>:
This patch adds a x86_64/avx assembler implementation of the Serpent block
cipher. The implementation is very similar to the sse2 implementation and
processes eight blocks in parallel. Because of the new non-destructive three
operand syntax all move-instructions can be removed and therefore a little
performance increase is provided.
/* /me adds CPU with AVX to wishlist. */
<snip>
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
new file mode 100644
index 0000000..85ef6e7
--- /dev/null
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -0,0 +1,949 @@
+/*
+ * Glue Code for AVX assembler versions of Serpent Cipher
+ *
+ * Copyright (C) 2012 Johannes Goetzfried
+ * <Johannes.Goetzfried@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
+ *
+ * Glue code based on twofish_avx_glue.c by:
Should this be serpent_sse2_glue.c?
+ * Copyright (C) 2011 Jussi Kivilinna <jussi.kivilinna@xxxxxxxx>
+ *
<snip>
+}, {
+ .cra_name = "ecb(serpent)",
+ .cra_driver_name = "ecb-serpent-avx",
+ .cra_priority = 400,
serpent_sse2_glue.c has priority 400 too, so you should increase
priority here to 500.
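
To illustrate why the tie matters, here is a rough userspace sketch, not
kernel code. struct alg_model and pick_alg() are made-up names for this
example, and "ecb-serpent-sse2" is assumed to be the sse2 driver name. It
only models "the highest priority wins for a given cra_name"; the real
lookup in the crypto API has more cases than this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an algorithm entry; not the kernel's struct crypto_alg. */
struct alg_model {
	const char *name;        /* generic name, e.g. "ecb(serpent)" */
	const char *driver_name; /* implementation-specific name */
	int priority;            /* higher value is preferred */
};

/*
 * Return the highest-priority entry whose generic name matches.
 * On a tie the earlier entry is kept, so in this model a later
 * registration with equal priority is never selected.
 */
static const struct alg_model *pick_alg(const struct alg_model *algs,
					size_t n, const char *name)
{
	const struct alg_model *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(algs[i].name, name) != 0)
			continue;
		if (best == NULL || algs[i].priority > best->priority)
			best = &algs[i];
	}
	return best;
}
```

With both entries at 400 this model returns whichever entry comes first in
the array; with the avx entry at 500 it returns the avx one regardless of
order.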
...
Actually, about duplicating the glue code: is it really needed? On x86_64,
both the avx and sse2 versions process 8 blocks in parallel, so the
glue code could easily be shared (as is done in SHA1 SSSE3/AVX).
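
Something along these lines is what I have in mind. This is only a userspace
sketch under my own assumptions: every name in it (glue_ops, select_ops,
ecb_encrypt_8way, the *_stub functions) is hypothetical, and the stubs just
XOR bytes with a tag so the two paths can be told apart. They stand in for
the real assembler routines. The idea is one ECB loop with the 8-way routine
chosen once at init time.

```c
#include <stddef.h>

#define SERPENT_BLOCK_SIZE 16 /* bytes per Serpent block */
#define PARALLEL_BLOCKS 8     /* both versions handle eight blocks per call */

typedef void (*enc_8way_fn)(unsigned char *dst, const unsigned char *src);

/* Placeholder transform: XOR one 8-block batch with a tag byte. */
static void xor_batch(unsigned char *dst, const unsigned char *src,
		      unsigned char tag)
{
	size_t i;

	for (i = 0; i < PARALLEL_BLOCKS * SERPENT_BLOCK_SIZE; i++)
		dst[i] = src[i] ^ tag;
}

/* Stand-ins for the sse2 and avx assembler entry points (hypothetical). */
static void enc_8way_sse2_stub(unsigned char *dst, const unsigned char *src)
{
	xor_batch(dst, src, 0x22);
}

static void enc_8way_avx_stub(unsigned char *dst, const unsigned char *src)
{
	xor_batch(dst, src, 0xA5);
}

/* Everything that differs between the two variants lives here. */
struct glue_ops {
	const char *driver_name;
	int priority;
	enc_8way_fn enc_8way;
};

/* Choose the variant once, e.g. from a CPU feature check at module init. */
static struct glue_ops select_ops(int cpu_has_avx)
{
	struct glue_ops ops;

	if (cpu_has_avx) {
		ops.driver_name = "ecb-serpent-avx";
		ops.priority = 500;
		ops.enc_8way = enc_8way_avx_stub;
	} else {
		ops.driver_name = "ecb-serpent-sse2";
		ops.priority = 400;
		ops.enc_8way = enc_8way_sse2_stub;
	}
	return ops;
}

/*
 * Shared ECB loop: processes full 8-block batches and returns the number
 * of blocks handled. Leftover blocks would go to a one-block fallback,
 * which is omitted here.
 */
static size_t ecb_encrypt_8way(const struct glue_ops *ops, unsigned char *dst,
			       const unsigned char *src, size_t nblocks)
{
	size_t done = 0;

	while (nblocks - done >= PARALLEL_BLOCKS) {
		ops->enc_8way(dst + done * SERPENT_BLOCK_SIZE,
			      src + done * SERPENT_BLOCK_SIZE);
		done += PARALLEL_BLOCKS;
	}
	return done;
}
```

The point is that only the function pointer, driver name, and priority
differ between the two variants; the block-walking logic is written once.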
-Jussi