Re: BUG: libkcapi tests trigger sleep-in-atomic bug in VMX code (ppc64)

Christophe LEROY <christophe.leroy@xxxxxx> · Tue, 21 Aug 2018 17:03:50 +0200

Le 21/08/2018 à 16:38, Ondrej Mosnáček a écrit :
ut 21. 8. 2018 o 16:18 Stephan Mueller <smueller@xxxxxxxxxx> napísal(a):
Am Dienstag, 21. August 2018, 14:48:11 CEST schrieb Ondrej Mosnáček:

Hi Ondrej, Marcelo,

(+Marcelo)

Looking at crypto/algif_skcipher.c, I can see that skcipher_recvmsg()
holds the socket lock the whole time and yet passes
CRYPTO_TFM_REQ_MAY_SLEEP to the cipher implementation. Isn't that
wrong?

I think you are referring to lock_sock(sk)?

If so, this should not be the culprit: the socket lock is in essence a mutex-
like operation with its own wait queue that it allowed to sleep. In
lock_sock_nested that is called by lock_sock it even has the call of
might_sleep which indicates that the caller may be put to sleep.

Looking into the code (without too much debugging) I see in the function
p8_aes_cbc_encrypt that is part of the stack trace the call to
preempt_disable() which starts an atomic context. The preempt_enable() is
invoked after the walk operation.

The preempt_disable increases the preempt_count. That counter is used by
in_atomic() to check whether we are in atomic context.

The issue is that blkcipher_walk_done may call crypto_yield() which then
invokes cond_resched if the implementation is allowed to sleep.

Indeed, you're right, the issue is actually in the vmx_crypto code. I
remember having looked at the 'ctr(aes)' implementation in there a few
days ago (I think I was trying to debug this very issue, but for some
reason I only looked at ctr(aes)...) and I didn't find any bug, so
that's why I jumped to suspecting the algif_skcipher code... I should
have double-checked :)

It turns out the 'cbc(aes)' (and actually also 'xts(aes)')
implementation is coded a bit differently and they both *do* contain
the sleep-in-atomic bug. I will try to fix them according to the
correct CTR implementation and send a patch.

CC: linuxppc-dev@xxxxxxxxxxxxxxxx <linuxppc-dev@xxxxxxxxxxxxxxxx>

Thanks,
Ondrej

@Marcelo: shouldn't be the sleep flag be cleared when entering the
preempt_disable section?

Ciao
Stephan