*ping* Anyone? -----Original Message----- From: linux-crypto-owner@xxxxxxxxxxxxxxx <linux-crypto-owner@xxxxxxxxxxxxxxx> On Behalf Of Gary R Hook Sent: Thursday, August 15, 2019 5:32 PM To: linux-crypto@xxxxxxxxxxxxxxx Subject: Crypto driver callback problem Context: We've run into a possible locking issue when using the AMD CCP device for crypto offload, with the new extra tests, and with lock debugging enabled. Pre-reqs: Enable the crypto self-tests and extra tests Build the crypto testing module (tcrypt) Enable the kernel hacking/lock debugging option "prove locking correctness" (CONFIG_PROVE_LOCKING=y) Problem: After booting this kernel, and loading our ccp-crypto module, run the tests with "modprobe tcrypt" and we get the dmesg output shown following. With some investigation we discovered that in the situation being reported here, the callback is executed as a tasklet, where we're not allowed to sleep. The called routine began its crypto operation with the MAY_SLEEP flag set, and allocates memory using GFP_KERNEL. The test itself runs in a typical context, and may sleep; the callback, however, runs as a tasklet yet thinks it can sleep based on the CRYPTO_TFM_REQ_MAY_SLEEP flag set in the request. Drivers for crypto are required (as we understand it) to use a tasklet approach for callbacks, which is needed (for one example) to support IPSec. (N.B. we did some poking around in other, similar crypto drivers, and all appear to be using tasklets.) We're looking for the best approach to resolve this. Thoughts? Gary [ 58.815592] alg: skcipher: skipping comparison tests for ofb(aes) because ofb(aes-generic) is unavailable [ 59.105437] alg: skcipher: skipping comparison tests for cfb(aes) because cfb(aes-generic) is unavailable [ 64.024347] alg: hash: failed to allocate transform for streebog256: -2 [ 64.039851] alg: hash: failed to allocate transform for streebog512: -2 [ 67.698789] alg: hash: failed to allocate transform for hmac(streebog256): -2 [ 67.720735] alg: hash: failed to allocate transform for hmac(streebog512): -2 [ 68.803745] ================================ [ 68.808515] WARNING: inconsistent lock state [ 68.813284] 5.3.0-rc2-oopschk+ #62 Tainted: G E [ 68.819796] -------------------------------- [ 68.824565] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 68.831272] ksoftirqd/56/299 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 68.837496] 00000000334d2af0 (fs_reclaim){+.?.}, at: fs_reclaim_acquire.part.101+0x5/0x30 [ 68.846637] {SOFTIRQ-ON-W} state was registered at: [ 68.852078] lock_acquire+0xa3/0x180 [ 68.856167] fs_reclaim_acquire.part.101+0x29/0x30 [ 68.861614] fs_reclaim_acquire+0x19/0x20 [ 68.866187] slab_pre_alloc_hook+0x19/0x70 [ 68.870847] __kmalloc_node+0x51/0x1f0 [ 68.875134] alloc_cpumask_var_node+0x1f/0x30 [ 68.880096] zalloc_cpumask_var+0x14/0x20 [ 68.884673] native_smp_prepare_cpus+0xd7/0x28e [ 68.889829] kernel_init_freeable+0x136/0x2d0 [ 68.894791] kernel_init+0xe/0x110 [ 68.898690] ret_from_fork+0x27/0x50 [ 68.902776] irq event stamp: 53764 [ 68.906577] hardirqs last enabled at (53764): [<ffffffff8e4d1b37>] slab_free_freelist_hook+0xd7/0x1c0 [ 68.916968] hardirqs last disabled at (53763): [<ffffffff8e4d1b18>] slab_free_freelist_hook+0xb8/0x1c0 [ 68.927363] softirqs last enabled at (53752): [<ffffffff8f00035e>] __do_softirq+0x35e/0x3fe [ 68.936791] softirqs last disabled at (53757): [<ffffffff8e2a0465>] run_ksoftirqd+0x35/0x50 [ 68.946115] other info that might help us debug this: [ 68.953403] Possible unsafe locking scenario: [ 68.960004] CPU0 [ 68.962732] ---- [ 68.965463] lock(fs_reclaim); [ 68.968963] <Interrupt> [ 68.971888] lock(fs_reclaim); [ 68.975585] *** DEADLOCK *** [ 68.982197] no locks held by ksoftirqd/56/299. [ 68.987157] stack backtrace: [ 68.992025] CPU: 56 PID: 299 Comm: ksoftirqd/56 Tainted: G E 5.3.0-rc2-oopschk+ #62 [ 69.002026] Hardware name: AMD Corporation Diesel/Diesel, BIOS RDL100BB 11/14/2018 [ 69.010480] Call Trace: [ 69.013212] dump_stack+0x8e/0xd5 [ 69.016915] print_usage_bug+0x1dd/0x1f0 [ 69.021298] mark_lock+0x454/0x4f0 [ 69.025094] ? find_held_lock+0x35/0xa0 [ 69.029376] ? check_usage_backwards+0x180/0x180 [ 69.034526] __lock_acquire+0x586/0x1470 [ 69.038909] ? trace_hardirqs_off+0x3b/0xe0 [ 69.043584] lock_acquire+0xa3/0x180 [ 69.047577] ? fs_reclaim_acquire.part.101+0x5/0x30 [ 69.053023] fs_reclaim_acquire.part.101+0x29/0x30 [ 69.058373] ? fs_reclaim_acquire.part.101+0x5/0x30 [ 69.063819] fs_reclaim_acquire+0x19/0x20 [ 69.068299] slab_pre_alloc_hook+0x19/0x70 [ 69.072875] __kmalloc+0x4e/0x270 [ 69.076578] ? skcipher_walk_next+0x1c9/0x570 [ 69.081444] skcipher_walk_next+0x1c9/0x570 [ 69.086115] skcipher_walk_first+0x46/0x110 [ 69.090788] skcipher_walk_skcipher+0xb7/0xd0 [ 69.095656] skcipher_walk_virt+0x2b/0x60 [ 69.100145] cbc_decrypt+0x4d/0xe0 [aesni_intel] [ 69.105310] crypto_skcipher_decrypt+0x1b/0x30 [ 69.110273] ? crypto_skcipher_decrypt+0x1b/0x30 [ 69.115431] simd_skcipher_decrypt+0xbe/0xd0 [crypto_simd] [ 69.121557] crypto_skcipher_decrypt+0x1b/0x30 [ 69.126525] crypto_authenc_decrypt_tail.isra.2+0xee/0x110 [ 69.132654] authenc_verify_ahash_done+0x2e/0x40 [ 69.137812] ccp_crypto_complete+0xde/0x190 [ 69.142488] ccp_do_cmd_complete+0x27/0x40 [ 69.147067] tasklet_action_common.isra.21+0x81/0x1a0 [ 69.152709] tasklet_action+0x22/0x30 [ 69.156796] __do_softirq+0xcd/0x3fe [ 69.160790] ? lockdep_hardirqs_on+0xfb/0x190 [ 69.165655] run_ksoftirqd+0x35/0x50 [ 69.169649] smpboot_thread_fn+0xfc/0x170 [ 69.174130] kthread+0x121/0x140 [ 69.177735] ? sort_range+0x30/0x30 [ 69.181630] ? kthread_park+0xb0/0xb0 [ 69.185720] ret_from_fork+0x27/0x50 [ 70.850617] alg: skcipher: failed to allocate transform for ecb(sm4): -2 [ 70.866302] alg: skcipher: failed to allocate transform for cbc(sm4): -2 [ 70.881646] alg: skcipher: failed to allocate transform for ctr(sm4): -2 [ 70.889161] tcrypt: one or more tests failed!