On Thu, Apr 04, 2024 at 04:53:12PM -0700, Dave Hansen wrote:
> On 4/4/24 16:36, Eric Biggers wrote:
> > 1. Never use zmm registers.
> ...
> > 4. Keep the proposed policy as the default behavior, but allow it to be
> >    overridden on the kernel command line.  This would be a bit more flexible;
> >    however, most people don't change defaults anyway.
> >
> > When you write "Some folks will also surely disagree with the kernel policy
> > implemented here", are there any specific concerns that you anticipate?
>
> Some people care less about the frequency throttling and only care about
> max performance _using_ AVX512.
>
> > Note that Intel has acknowledged the zmm downclocking issues on Ice
> > Lake and suggested that using ymm registers instead would be
> > reasonable:
> > https://lore.kernel.org/linux-crypto/e8ce1146-3952-6977-1d0e-a22758e58914@xxxxxxxxx/
> >
> > If there is really a controversy, my vote is that for now we just go with option
> > (1), i.e. drop this patch from the series.  We can reconsider the issue when a
> > CPU is released with better 512-bit support.
>
> (1) is fine with me.
>
> (4) would also be fine.  But I don't think it absolutely _has_ to be a
> boot-time switch.  What prevents you from registering, say,
> "xts-aes-vaes-avx10" and then doing:
>
> 	if (avx512_is_desired())
> 		xts-aes-vaes-avx10_512(...);
> 	else
> 		xts-aes-vaes-avx10_256(...);
>
> at runtime?
>
> Where avx512_is_desired() can be changed willy-nilly, either with a
> command-line parameter or runtime knob.  Sure, the performance might
> change versus what was measured, but I don't think that's a deal breaker.
>
> Then if folks want to do fancy benchmarks or model/family checks or
> whatever, they can do it in userspace at runtime.

It's certainly possible for a single crypto algorithm (using "algorithm" in the
crypto API sense of the word) to have multiple alternative code paths, and there
are examples of this in arch/x86/crypto/.
However, I think this is a poor practice, at least as the crypto API is
currently designed, because it makes it difficult to test the different code
paths.  Alternatives are best handled by registering them as separate algorithms
with different cra_priority values.

Also, I forgot one property of my patch, which is that because I made the
zmm_exclusion_list just decrease the priority of xts-aes-vaes-avx10_512 rather
than skipping registering it, the change actually can be undone at runtime by
increasing the priority of xts-aes-vaes-avx10_512 back to its original value.
Userspace can do it using the "crypto user configuration API"
(include/uapi/linux/cryptouser.h), specifically CRYPTO_MSG_UPDATEALG.

Maybe that is enough configurability already?

- Eric
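
For illustration, the separate-algorithms approach and the runtime priority
change described above can be sketched as a small userspace model.  This is not
kernel code: model_alg, model_register(), model_lookup(),
model_update_priority() and model_setup() are hypothetical names, and the
priority values 700, 800 and 1 are assumptions chosen for the example, not
values confirmed in this thread.  The lookup rule is a simplified version of
what the crypto API does: an exact driver-name match wins, otherwise the
highest-priority implementation of the requested algorithm name is chosen.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_alg {
	const char *name;        /* algorithm name, in the role of cra_name */
	const char *driver_name; /* implementation name, like cra_driver_name */
	int priority;            /* in the role of cra_priority; higher wins */
};

#define MODEL_MAX_ALGS 8
static struct model_alg model_algs[MODEL_MAX_ALGS];
static size_t model_nr_algs;

/* Register one implementation as its own entry. */
static void model_register(const char *name, const char *driver, int prio)
{
	assert(model_nr_algs < MODEL_MAX_ALGS);
	model_algs[model_nr_algs++] = (struct model_alg){ name, driver, prio };
}

/*
 * Look up an implementation.  An exact driver-name match is returned
 * immediately, which is what lets each implementation be requested and
 * tested on its own.  Otherwise the highest-priority entry with a
 * matching algorithm name is returned, or NULL if there is none.
 */
static const struct model_alg *model_lookup(const char *name)
{
	const struct model_alg *best = NULL;

	for (size_t i = 0; i < model_nr_algs; i++) {
		const struct model_alg *alg = &model_algs[i];

		if (strcmp(alg->driver_name, name) == 0)
			return alg;
		if (strcmp(alg->name, name) != 0)
			continue;
		if (best == NULL || alg->priority > best->priority)
			best = alg;
	}
	return best;
}

/*
 * Change the priority of one implementation, selected by driver name.
 * This stands in for what a CRYPTO_MSG_UPDATEALG request would do.
 * Returns 0 on success, -1 if no such driver is registered.
 */
static int model_update_priority(const char *driver, int prio)
{
	for (size_t i = 0; i < model_nr_algs; i++) {
		if (strcmp(model_algs[i].driver_name, driver) == 0) {
			model_algs[i].priority = prio;
			return 0;
		}
	}
	return -1;
}

/*
 * Register both implementations.  On an "excluded" CPU the 512-bit one is
 * still registered, just with a lowered priority (assumed value: 1).
 */
static void model_setup(int cpu_on_exclusion_list)
{
	model_nr_algs = 0;
	model_register("xts(aes)", "xts-aes-vaes-avx10_256", 700);
	model_register("xts(aes)", "xts-aes-vaes-avx10_512",
		       cpu_on_exclusion_list ? 1 : 800);
}
```

In this model, after model_setup(1) a lookup of "xts(aes)" resolves to the
256-bit implementation, while a lookup of "xts-aes-vaes-avx10_512" by driver
name still reaches the 512-bit one.  Calling
model_update_priority("xts-aes-vaes-avx10_512", 800) makes the "xts(aes)"
lookup resolve to the 512-bit implementation again, with no boot-time switch
involved.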