Hi!

The purpose of the series is to allow KASAN to detect use-after-free
access in SLAB_TYPESAFE_BY_RCU slab caches, by essentially making them
behave as if the cache was not SLAB_TYPESAFE_BY_RCU but instead every
kfree() in the cache was a kfree_rcu().
This is gated behind a config flag that is supposed to only be enabled
in fuzzing/testing builds where the performance impact doesn't matter.

Output of the new kunit testcase I added to the KASAN test suite:
==================================================================
BUG: KASAN: slab-use-after-free in kmem_cache_rcu_uaf+0x3ae/0x4d0
Read of size 1 at addr ffff888106224000 by task kunit_try_catch/224

CPU: 7 PID: 224 Comm: kunit_try_catch Tainted: G B N 6.10.0-00003-g065427d4b87f #430
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 print_report+0xce/0x670
[...]
 kasan_report+0xa5/0xe0
[...]
 kmem_cache_rcu_uaf+0x3ae/0x4d0
[...]
 kunit_try_run_case+0x1b3/0x490
[...]
 kunit_generic_run_threadfn_adapter+0x80/0xe0
 kthread+0x2a5/0x370
[...]
 ret_from_fork+0x34/0x70
[...]
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 224:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_slab_alloc+0x6e/0x70
 kmem_cache_alloc_noprof+0xef/0x2b0
 kmem_cache_rcu_uaf+0x10d/0x4d0
 kunit_try_run_case+0x1b3/0x490
 kunit_generic_run_threadfn_adapter+0x80/0xe0
 kthread+0x2a5/0x370
 ret_from_fork+0x34/0x70
 ret_from_fork_asm+0x1a/0x30

Freed by task 0:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x57/0x80
 slab_free_after_rcu_debug+0xe3/0x220
 rcu_core+0x676/0x15b0
 handle_softirqs+0x22f/0x690
 irq_exit_rcu+0x84/0xb0
 sysvec_apic_timer_interrupt+0x6a/0x80
 asm_sysvec_apic_timer_interrupt+0x1a/0x20

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0x8e/0xa0
 kmem_cache_free+0x10c/0x420
 kmem_cache_rcu_uaf+0x16e/0x4d0
 kunit_try_run_case+0x1b3/0x490
 kunit_generic_run_threadfn_adapter+0x80/0xe0
 kthread+0x2a5/0x370
 ret_from_fork+0x34/0x70
 ret_from_fork_asm+0x1a/0x30

The buggy address belongs to the object at ffff888106224000
 which belongs to the cache test_cache of size 200
The buggy address is located 0 bytes inside of
 freed 200-byte region [ffff888106224000, ffff8881062240c8)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x106224
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x200000000000040(head|node=0|zone=2)
page_type: 0xffffefff(slab)
raw: 0200000000000040 ffff88810621c140 dead000000000122 0000000000000000
raw: 0000000000000000 00000000801f001f 00000001ffffefff 0000000000000000
head: 0200000000000040 ffff88810621c140 dead000000000122 0000000000000000
head: 0000000000000000 00000000801f001f 00000001ffffefff 0000000000000000
head: 0200000000000001 ffffea0004188901 ffffffffffffffff 0000000000000000
head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888106223f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888106223f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888106224000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888106224080: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
 ffff888106224100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
    ok 38 kmem_cache_rcu_uaf

Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
---
Changes in v8:
- in patch 2/2:
  - move rcu_barrier() out of locked region (vbabka)
  - rearrange code in slab_free_after_rcu_debug (vbabka)
- Link to v7: https://lore.kernel.org/r/20240808-kasan-tsbrcu-v7-0-0d0590c54ae6@xxxxxxxxxx

Changes in v7:
- in patch 2/2:
  - clarify kconfig comment (Marco)
  - fix memory leak (vbabka and dsterba)
  - move rcu_barrier() call up into kmem_cache_destroy() to hopefully
    make the merge conflict with vbabka's
    https://lore.kernel.org/all/20240807-b4-slab-kfree_rcu-destroy-v2-1-ea79102f428c@xxxxxxx/
    easier to deal with
- Link to v6: https://lore.kernel.org/r/20240802-kasan-tsbrcu-v6-0-60d86ea78416@xxxxxxxxxx

Changes in v6:
- in patch 1/2:
  - fix commit message (Andrey)
  - change comments (Andrey)
  - fix mempool handling of kfence objects (Andrey)
- in patch 2/2:
  - fix is_kfence_address argument (syzbot and Marco)
  - refactor slab_free_hook() to create "still_accessible" variable
  - change kasan_slab_free() hook argument to "still_accessible"
  - add documentation to kasan_slab_free() hook
- Link to v5: https://lore.kernel.org/r/20240730-kasan-tsbrcu-v5-0-48d3cbdfccc5@xxxxxxxxxx

Changes in v5:
- rebase to latest origin/master (akpm), no other changes from v4
- Link to v4: https://lore.kernel.org/r/20240729-kasan-tsbrcu-v4-0-57ec85ef80c6@xxxxxxxxxx

Changes in v4:
- note I kept vbabka's ack for the SLUB changes in patch 1/2 since the
  SLUB part didn't change, even though I refactored a bunch of the
  KASAN parts
- in patch 1/2 (major rework):
  - fix commit message (Andrey)
  - add doc comments
    in header (Andrey)
  - remove "ip" argument from __kasan_slab_free()
  - rework the whole check_slab_free() thing and move code around (Andrey)
- in patch 2/2:
  - kconfig description and dependency changes (Andrey)
  - remove useless linebreak (Andrey)
  - fix comment style (Andrey)
  - fix do_slab_free() invocation (kernel test robot)
- Link to v3: https://lore.kernel.org/r/20240725-kasan-tsbrcu-v3-0-51c92f8f1101@xxxxxxxxxx

Changes in v3:
- in patch 1/2, integrate akpm's fix for !CONFIG_KASAN build failure
- in patch 2/2, as suggested by vbabka, use dynamically allocated
  rcu_head to avoid having to add slab metadata
- in patch 2/2, add a warning in the kconfig help text that objects can
  be recycled immediately under memory pressure
- Link to v2: https://lore.kernel.org/r/20240724-kasan-tsbrcu-v2-0-45f898064468@xxxxxxxxxx

Changes in v2:
Patch 1/2 is new; it's some necessary prep work for the main patch to
work, though the KASAN integration maybe is a bit ugly.
Patch 2/2 is a rebased version of the old patch, with some changes to
how the config is wired up, with poison/unpoison logic added as
suggested by dvyukov@ back then, with cache destruction fixed using
rcu_barrier() as pointed out by dvyukov@ and the test robot, and a test
added as suggested by elver@.

---
Jann Horn (2):
      kasan: catch invalid free before SLUB reinitializes the object
      slub: Introduce CONFIG_SLUB_RCU_DEBUG

 include/linux/kasan.h | 63 ++++++++++++++++++++++++++++++++++---
 mm/Kconfig.debug      | 32 +++++++++++++++++++
 mm/kasan/common.c     | 62 ++++++++++++++++++++++---------------
 mm/kasan/kasan_test.c | 46 +++++++++++++++++++++++++++
 mm/slab_common.c      | 16 ++++++++++
 mm/slub.c             | 86 ++++++++++++++++++++++++++++++++++++++++++++++-----
 6 files changed, 267 insertions(+), 38 deletions(-)
---
base-commit: 94ede2a3e9135764736221c080ac7c0ad993dc2d
change-id: 20240723-kasan-tsbrcu-b715a901f776
-- 
Jann Horn <jannh@xxxxxxxxxx>
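
Illustration only: the behaviour described at the top of this cover
letter (a free in a SLAB_TYPESAFE_BY_RCU cache is treated roughly like
a kfree_rcu(), so the object only becomes inaccessible to KASAN once a
grace period has passed) can be sketched as a userspace toy model in C.
This is not kernel code and not part of the patches. Every toy_* name is
made up for the sketch, the "grace period" is just a function call, and
refusing a second free is a choice of the toy model, not a claim about
what the kernel does.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PENDING 8

enum toy_state { TOY_ALLOCATED, TOY_FREE_PENDING, TOY_POISONED };

struct toy_obj {
	enum toy_state state;
};

/* Objects whose free was requested but whose "grace period" has not ended. */
static struct toy_obj *toy_pending[TOY_MAX_PENDING];
static size_t toy_nr_pending;

/* Models a free with the debug option on: queue the object, do not poison yet. */
bool toy_free(struct toy_obj *obj)
{
	/* Toy-model choice: refuse objects that are not allocated, or a full queue. */
	if (obj->state != TOY_ALLOCATED || toy_nr_pending == TOY_MAX_PENDING)
		return false;
	obj->state = TOY_FREE_PENDING;
	toy_pending[toy_nr_pending++] = obj;
	return true;
}

/* Models the end of a grace period: only now are queued objects poisoned. */
void toy_grace_period_end(void)
{
	for (size_t i = 0; i < toy_nr_pending; i++)
		toy_pending[i]->state = TOY_POISONED;
	toy_nr_pending = 0;
}

/* Models a checked read: false is where a use-after-free would be reported. */
bool toy_read_ok(const struct toy_obj *obj)
{
	return obj->state != TOY_POISONED;
}

/* Sequence: free, read before the grace period ends, end it, read again. */
int toy_demo(void)
{
	struct toy_obj obj = { .state = TOY_ALLOCATED };
	bool queued = toy_free(&obj);

	assert(queued);
	assert(toy_read_ok(&obj));	/* still accessible: free is deferred */
	toy_grace_period_end();
	assert(!toy_read_ok(&obj));	/* now a read would be flagged */
	return 0;
}
```

Calling toy_demo() walks through the whole sequence. The point of the
model is only the ordering: the read between the free and the end of the
grace period is accepted, the read afterwards is not.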