Hi Andrew, This is a quick fixup to patch 1 of the "kasan: support backing vmalloc space with real shadow memory" series, v11, which you pulled in to your mmotm tree. There are 2 changes: - A fixup to the per-cpu allocator path to avoid allocating memory under a spinlock, thanks Qian Cai. - Insert flush_cache_vmap() between mapping shadow and poisoning it. This is a no-op on x86 and arm64, but on powerpc it does a ptesync instruction which prevents occasional page faults. Here are updated benchmark figures for the commit message: Testing with test_vmalloc.sh on an x86 VM with 2 vCPUs shows that: - Turning on KASAN, inline instrumentation, without vmalloc, introuduces a 5.7x-6.4x slowdown in vmalloc operations. - Turning this on introduces the following slowdowns over KASAN: * ~1.82x slower single-threaded (test_vmalloc.sh performance) * ~2.11x slower when both cpus are performing operations simultaneously (test_vmalloc.sh sequential_test_order=1) This is unfortunate, but given that this is a debug feature only, not the end of the world. The full results are: Performance No KASAN KASAN original x baseline KASAN vmalloc x baseline x KASAN fix_size_alloc_test 662004 11404956 17.23 19144610 28.92 1.68 full_fit_alloc_test 710950 12029752 16.92 13184651 18.55 1.10 long_busy_list_alloc_test 9431875 43990172 4.66 82970178 8.80 1.89 random_size_alloc_test 5033626 23061762 4.58 47158834 9.37 2.04 fix_align_alloc_test 1252514 15276910 12.20 31266116 24.96 2.05 random_size_align_alloc_te 1648501 14578321 8.84 25560052 15.51 1.75 align_shift_alloc_test 147 830 5.65 5692 38.72 6.86 pcpu_alloc_test 80732 125520 1.55 140864 1.74 1.12 Total Cycles 119240774314 763211341128 6.40 1390338696894 11.66 1.82 Sequential, 2 cpus No KASAN KASAN original x baseline KASAN vmalloc x baseline x KASAN fix_size_alloc_test 1423150 14276550 10.03 27733022 19.49 1.94 full_fit_alloc_test 1754219 14722640 8.39 15030786 8.57 1.02 long_busy_list_alloc_test 11451858 52154973 4.55 107016027 9.34 2.05 random_size_alloc_test 5989020 26735276 4.46 68885923 11.50 2.58 fix_align_alloc_test 2050976 20166900 9.83 50491675 24.62 2.50 random_size_align_alloc_te 2858229 17971700 6.29 38730225 13.55 2.16 align_shift_alloc_test 405 6428 15.87 26253 64.82 4.08 pcpu_alloc_test 127183 151464 1.19 216263 1.70 1.43 Total Cycles 54181269392 308723699764 5.70 650772566394 12.01 2.11 fix_size_alloc_test 1420404 14289308 10.06 27790035 19.56 1.94 full_fit_alloc_test 1736145 14806234 8.53 15274301 8.80 1.03 long_busy_list_alloc_test 11404638 52270785 4.58 107550254 9.43 2.06 random_size_alloc_test 6017006 26650625 4.43 68696127 11.42 2.58 fix_align_alloc_test 2045504 20280985 9.91 50414862 24.65 2.49 random_size_align_alloc_te 2845338 17931018 6.30 38510276 13.53 2.15 align_shift_alloc_test 472 3760 7.97 9656 20.46 2.57 pcpu_alloc_test 118643 132732 1.12 146504 1.23 1.10 Total Cycles 54040011688 309102805492 5.72 651325675652 12.05 2.11 Cc: Qian Cai <cai@xxxxxx> Signed-off-by: Daniel Axtens <dja@xxxxxxxxxx> --- mm/kasan/common.c | 2 ++ mm/vmalloc.c | 5 ++++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/mm/kasan/common.c b/mm/kasan/common.c index 6e7bc5d3fa83..df3371d5c572 100644 --- a/mm/kasan/common.c +++ b/mm/kasan/common.c @@ -794,6 +794,8 @@ int kasan_populate_vmalloc(unsigned long requested_size, struct vm_struct *area) if (ret) return ret; + flush_cache_vmap(shadow_start, shadow_end); + kasan_unpoison_shadow(area->addr, requested_size); area->flags |= VM_KASAN; diff --git a/mm/vmalloc.c b/mm/vmalloc.c index a4b950a02d0b..bf030516258c 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3417,11 +3417,14 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC, pcpu_get_vm_areas); + } + spin_unlock(&vmap_area_lock); + /* populate the shadow space outside of the lock */ + for (area = 0; area < nr_vms; area++) { /* assume success here */ kasan_populate_vmalloc(sizes[area], vms[area]); } - spin_unlock(&vmap_area_lock); kfree(vas); return vms; -- 2.20.1