On Mon, Aug 05, 2024 at 10:41:21AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-08-02 09:28:32 [-0700], Eric Biggers wrote:
> > Hi Sebastian,
> Hi Eric,
>
> > > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > > index cd37de5ec4046..be92e4c3f9c7f 100644
> > > --- a/arch/x86/crypto/aesni-intel_glue.c
> > > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> > >  			aes_gcm_update(key, le_ctr, ghash_acc,
> > >  				       walk.src.virt.addr, walk.dst.virt.addr,
> > >  				       nbytes, flags);
> > > +			kernel_fpu_end();
> > >  			err = skcipher_walk_done(&walk, 0);
> > > +			kernel_fpu_begin();
> > >  			/*
> > >  			 * The low word of the counter isn't used by the
> > >  			 * finalize, so there's no need to increment it here.
> >
> > Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
> > performance for everyone else?
>
> Every other instance in this file had a kernel_fpu_end()/begin() before
> skcipher_walk_done(), so I thought this one was just missed by chance.

No, it was intentional.  See the comment above the first kernel_fpu_begin() in
gcm_crypt():

	/*
	 * Since the AES-GCM assembly code requires that at least three assembly
	 * functions be called to process any message (this is needed to support
	 * incremental updates cleanly), to reduce overhead we try to do all
	 * three calls in the same kernel FPU section if possible.  We close the
	 * section and start a new one if there are multiple data segments or if
	 * rescheduling is needed while processing the associated data.
	 */
	kernel_fpu_begin();

> > Note that kfree() lacks a might_sleep(), and its kerneldoc does not say
> > that it can sleep.  Have you checked for other instances of this same
> > problem?  It seems it would be quite common kernel-wide.
>
> kfree() can't have a might_sleep() because it does not qualify for it:
> you can use kfree() in softirq context, for instance with an acquired
> spinlock_t on !RT, which would trigger the check.
> On PREEMPT_RT, interrupts are threaded, softirq is preemptible, and
> spinlock_t is a sleeping lock, so all the places where kfree() would
> have been invoked in preempt-disabled context on !PREEMPT_RT are
> actually preemptible on PREEMPT_RT.
> This is of course not true in cases where preemption is explicitly
> disabled, as in this case.

WARN_ON(!preemptible()) then?  If I add that to kfree(), it triggers from lots
of other places.  Are those problems on PREEMPT_RT too?

What I am trying to get at is: what debugging options do I need to detect
issues like this?  Is there really no option other than actually running a
PREEMPT_RT kernel?  I had tested this code with lots of debug options
pre-merge and nothing came up.  If there had been something in
CONFIG_SLUB_DEBUG, for example, I would have seen it, and you would never
have had to deal with this issue at all, as it would never have been
introduced.

- Eric