On Sat, Jan 18, 2025 at 9:06 AM Eric Biggers <ebiggers@xxxxxxxxxx> wrote: > > On Tue, Dec 03, 2024 at 10:10:13AM -0500, Bartosz Golaszewski wrote: > > On Tue, 3 Dec 2024 14:53:21 +0100, neil.armstrong@xxxxxxxxxx said: > > > On 03/12/2024 10:19, Bartosz Golaszewski wrote: > > >> From: Bartosz Golaszewski <bartosz.golaszewski@xxxxxxxxxx> > > >> > > >> Having switched to workqueue from tasklet, we are no longer limited to > > >> atomic APIs and can now convert the spinlock to a mutex. This, along > > >> with the conversion from tasklet to workqueue grants us ~15% improvement > > >> in cryptsetup benchmarks for AES encryption. > > > > > > Can you share on which platforms you did the tests and the results you got ? > > > > > > > Sure, I tested on sm8650 with the following results (they vary from > > one run to other but are more or less in this range): > > FYI, when I test this driver on sm8650 with linux-next there are lots of test > failures, including ones where the driver straight out returns wrong hashes: > > [ 6.330656] alg: skcipher: xts-aes-qce setkey failed on test vector 0; expected_error=0, actual_error=-126, flags=0x1 > [ 6.347196] alg: self-tests for xts(aes) using xts-aes-qce failed (rc=-126) > [ 6.347200] alg: self-tests for xts(aes) using xts-aes-qce failed (rc=-126) > [ 6.784951] alg: skcipher: ctr-aes-qce encryption test failed (wrong output IV) on test vector 4, cfg="in-place (one sglist)" > [ 6.810709] alg: self-tests for ctr(aes) using ctr-aes-qce failed (rc=-22) > [ 6.941639] alg: skcipher: cbc-3des-qce setkey failed on test vector "random: len=16 klen=24"; expected_error=0, actual_error=-126, flags=0x1 > [ 6.947029] alg: self-tests for ctr(aes) using ctr-aes-qce failed (rc=-22) > [ 6.954394] alg: self-tests for cbc(des3_ede) using cbc-3des-qce failed (rc=-126) > [ 6.975348] alg: self-tests for cbc(des3_ede) using cbc-3des-qce failed (rc=-126) > [ 7.454433] alg: skcipher: ecb-3des-qce setkey failed on test vector "random: len=32 klen=24"; expected_error=0, actual_error=-126, flags=0x1 > [ 7.454482] alg: self-tests for ecb(des3_ede) using ecb-3des-qce failed (rc=-126) > [ 7.454493] alg: self-tests for ecb(des3_ede) using ecb-3des-qce failed (rc=-126) > [ 8.593828] alg: ahash: hmac-sha256-qce test failed (wrong result) on test vector "random: psize=0 ksize=80", cfg="random: may_sleep use_finup src_divs=[<reimport>100.0%@+1800] key_offset=7" > [ 8.627337] alg: self-tests for hmac(sha256) using hmac-sha256-qce failed (rc=-22) > [ 8.639889] alg: self-tests for hmac(sha256) using hmac-sha256-qce failed (rc=-22) > [ 8.933595] alg: ahash: hmac-sha1-qce test failed (wrong result) on test vector "random: psize=0 ksize=56", cfg="random: inplace_one_sglist use_finup nosimd_setkey src_divs=[100.0%@+3969] key_offset=71" > [ 8.952885] alg: self-tests for hmac(sha1) using hmac-sha1-qce failed (rc=-22) > [ 8.965096] alg: self-tests for hmac(sha1) using hmac-sha1-qce failed (rc=-22) > I was testing with kcapi-speed and cryptsetup benchmark. I've never seen any errors. Is this after my changes only or did it exist before? You're testing with the tcrypt module? How are you inserting it exactly? What params? > Also a KASAN error: > > [ 9.420862] CPU: 3 UID: 0 PID: 393 Comm: cryptomgr_test Tainted: G S W 6.13.0-rc7-next-20250117-00007-g58182eb6d73d #12 > [ 9.420881] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN > [ 9.420886] Hardware name: Qualcomm Technologies, Inc. SM8650 HDK (DT) > [ 9.420891] Call trace: > [ 9.420895] show_stack+0x18/0x24 (C) > [ 9.420918] dump_stack_lvl+0x60/0x80 > [ 9.420937] print_report+0x17c/0x4d0 > [ 9.420960] kasan_report+0xb0/0xf8 > [ 9.420983] kasan_check_range+0x100/0x1a8 > [ 9.421001] __asan_memcpy+0x3c/0xa0 > [ 9.421020] swiotlb_bounce+0x1b4/0x340 > [ 9.421034] swiotlb_tbl_map_single+0x2ac/0x5a0 > [ 9.421050] iommu_dma_map_page+0x2b8/0x4cc > [ 9.421063] iommu_dma_map_sg+0x2e8/0xb3c > [ 9.421072] __dma_map_sg_attrs+0x11c/0x1b0 > [ 9.421086] dma_map_sg_attrs+0x10/0x20 > [ 9.421097] qce_aead_async_req_handle+0x784/0x1aa0 > [ 9.421125] qce_handle_queue+0x20c/0x3e8 > [ 9.421139] qce_async_request_enqueue+0x10/0x1c > [ 9.421153] qce_aead_crypt+0x1f0/0x81c > [ 9.421168] qce_aead_encrypt+0x14/0x20 > [ 9.421182] crypto_aead_encrypt+0xa0/0xe0 > [ 9.421194] test_aead_vec_cfg+0x84c/0x1be8 > [ 9.421207] test_aead_vs_generic_impl+0x530/0x850 > [ 9.421219] alg_test_aead+0x7c8/0xe40 > [ 9.421230] alg_test+0x1f4/0xb0c > [ 9.421241] cryptomgr_test+0x50/0x80 > [ 9.421251] kthread+0x378/0x678 > [ 9.421272] ret_from_fork+0x10/0x20 > > [ 9.550765] Allocated by task 393: > [ 9.554279] kasan_save_stack+0x3c/0x64 > [ 9.558252] kasan_save_track+0x20/0x40 > [ 9.562221] kasan_save_alloc_info+0x40/0x54 > [ 9.566641] __kasan_kmalloc+0xb8/0xbc > [ 9.570515] __kmalloc_noprof+0x188/0x410 > [ 9.574669] qce_aead_async_req_handle+0xbd4/0x1aa0 > [ 9.579701] qce_handle_queue+0x20c/0x3e8 > [ 9.583841] qce_async_request_enqueue+0x10/0x1c > [ 9.588607] qce_aead_crypt+0x1f0/0x81c > [ 9.592577] qce_aead_encrypt+0x14/0x20 > [ 9.596547] crypto_aead_encrypt+0xa0/0xe0 > [ 9.600776] test_aead_vec_cfg+0x84c/0x1be8 > [ 9.605097] test_aead_vs_generic_impl+0x530/0x850 > [ 9.610039] alg_test_aead+0x7c8/0xe40 > [ 9.613911] alg_test+0x1f4/0xb0c > [ 9.617345] cryptomgr_test+0x50/0x80 > [ 9.621124] kthread+0x378/0x678 > [ 9.624473] ret_from_fork+0x10/0x20 > > [ 9.629738] The buggy address belongs to the object at ffff5c7440d1cc00 > which belongs to the cache kmalloc-32 of size 32 > [ 9.642426] The buggy address is located 0 bytes inside of > allocated 22-byte region [ffff5c7440d1cc00, ffff5c7440d1cc16) > > [ 9.656672] The buggy address belongs to the physical page: > [ 9.662409] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x880d1c > [ 9.670646] flags: 0xbfffe0000000000(node=0|zone=2|lastcpupid=0x1ffff) > [ 9.677371] page_type: f5(slab) > [ 9.680624] raw: 0bfffe0000000000 ffff5c7440002780 dead000000000100 dead000000000122 > [ 9.688595] raw: 0000000000000000 0000000080400040 00000000f5000000 0000000000000000 > [ 9.696560] page dumped because: kasan: bad access detected > > [ 9.703854] Memory state around the buggy address: > [ 9.708796] ffff5c7440d1cb00: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc > [ 9.716227] ffff5c7440d1cb80: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc > [ 9.723660] >ffff5c7440d1cc00: 00 00 06 fc fc fc fc fc fa fb fb fb fc fc fc fc > [ 9.731092] ^ > [ 9.734959] ffff5c7440d1cc80: fa fb fb fb fc fc fc fc 00 00 06 fc fc fc fc fc > [ 9.742392] ffff5c7440d1cd00: 00 00 06 fc fc fc fc fc fa fb fb fb fc fc fc fc > [ 9.749824] ================================================================== > Hmm, I am running with KASAN on, never seen this How would I run tcrypt to trigger it? By default the module doesn't load with no params and gives me: modprobe: ERROR: could not insert 'tcrypt': Resource temporarily unavailable > > I personally still struggle to understand how this driver could plausibly be > useful when the software crypto has no issues, is much faster, and is much > better tested. What is motivating having this driver in the kernel? We want to use it in conjunction with the upcoming scminvoke (for loading TAs and invoking objects - used to program the keys into the QCE) to support the DRM use-case for decrypting streaming data inside secure buffers upstream. Bart