On Wed, May 04, 2022 at 12:18:21AM +0000, Nathan Huckleberry wrote:
> +.macro schoolbook1_iteration i xor_sum
> +	movups (16*\i)(MSG), %xmm0
> +	.if (\i == 0 && \xor_sum == 1)
> +		pxor SUM, %xmm0
> +	.endif
> +        vpclmulqdq $0x01, (16*\i)(KEY_POWERS), %xmm0, %xmm2
> +        vpclmulqdq $0x00, (16*\i)(KEY_POWERS), %xmm0, %xmm1
> +        vpclmulqdq $0x10, (16*\i)(KEY_POWERS), %xmm0, %xmm3
> +        vpclmulqdq $0x11, (16*\i)(KEY_POWERS), %xmm0, %xmm4
> +        vpxor %xmm2, MI, MI
> +        vpxor %xmm1, LO, LO
> +        vpxor %xmm4, HI, HI
> +        vpxor %xmm3, MI, MI
> +.endm

The 8 lines above are indented with spaces.  They should use tabs, like
everywhere else.

> + * So our final computation is: T = T_1 : T_0 = g*(x) * P_0 V = V_1 : V_0 =
> + * g*(x) * (P_1 + T_0) p(x) / x^{128} mod g(x) = P_3 + P_1 + T_0 + V_1 : P_2 +
> + * P_0 + T_1 + V_0

This part is unreadable now -- it looks like you formatted it as regular
text?  The three equations should be on their own lines, like how it was
before.

> +__maybe_unused static const struct x86_cpu_id pcmul_cpu_id[] = {
> +	X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL),
> +	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
> +	{}
> +};
> +MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id);
> +
> +static int __init polyval_clmulni_mod_init(void)
> +{
> +	if (!x86_match_cpu(pcmul_cpu_id))
> +		return -ENODEV;
> +
> +	return crypto_register_shash(&polyval_alg);
> +}
> +
> +static void __exit polyval_clmulni_mod_exit(void)
> +{
> +	crypto_unregister_shash(&polyval_alg);
> +}

This won't work as intended; it's registering the algorithm (and
autoloading the module) if PCLMUL *or* AVX is available, rather than
PCLMUL *and* AVX.

I think the way to go is to just have X86_FEATURE_PCLMULQDQ in the table,
like before, and add a check for boot_cpu_has(X86_FEATURE_AVX) in the init
function.

- Eric
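
The difference between the two-entry table and the suggested fix can be illustrated with a minimal userspace sketch. It assumes, as the review states, that x86_match_cpu() succeeds as soon as any single table entry matches. Every model_* name and feature bit value below is a hypothetical stand-in for illustration, not a kernel API, and the sketch is not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel feature bits; not the real values. */
enum model_feature {
	MODEL_FEATURE_NONE = 0,			/* terminator, like the {} entry */
	MODEL_FEATURE_PCLMULQDQ = 1 << 0,
	MODEL_FEATURE_AVX = 1 << 1,
};

/* Models boot_cpu_has(): is this one feature bit set on the CPU? */
static bool model_cpu_has(unsigned int cpu_features, enum model_feature f)
{
	return (cpu_features & (unsigned int)f) != 0;
}

/* Models x86_match_cpu(): true as soon as any one table entry matches. */
static bool model_match_cpu(unsigned int cpu_features,
			    const enum model_feature *table)
{
	for (size_t i = 0; table[i] != MODEL_FEATURE_NONE; i++) {
		if (model_cpu_has(cpu_features, table[i]))
			return true;
	}
	return false;
}

/* As posted: two table entries, so PCLMULQDQ *or* AVX is enough. */
static bool model_init_as_posted(unsigned int cpu_features)
{
	static const enum model_feature table[] = {
		MODEL_FEATURE_PCLMULQDQ,
		MODEL_FEATURE_AVX,
		MODEL_FEATURE_NONE,
	};

	return model_match_cpu(cpu_features, table);
}

/* As suggested: PCLMULQDQ in the table, AVX checked separately in init. */
static bool model_init_as_suggested(unsigned int cpu_features)
{
	static const enum model_feature table[] = {
		MODEL_FEATURE_PCLMULQDQ,
		MODEL_FEATURE_NONE,
	};

	if (!model_match_cpu(cpu_features, table))
		return false;

	return model_cpu_has(cpu_features, MODEL_FEATURE_AVX);
}

/* Checks the case the review describes: PCLMULQDQ present, AVX absent. */
static void model_self_check(void)
{
	unsigned int pclmul_only = MODEL_FEATURE_PCLMULQDQ;
	unsigned int both = MODEL_FEATURE_PCLMULQDQ | MODEL_FEATURE_AVX;

	assert(model_init_as_posted(pclmul_only));	/* registers anyway */
	assert(!model_init_as_suggested(pclmul_only));	/* correctly refuses */
	assert(model_init_as_posted(both));
	assert(model_init_as_suggested(both));
}
```

In this model, a CPU with only one of the two features passes the as-posted check but fails the as-suggested one, which is the OR-versus-AND behavior described above.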