On 10/4/24 08:44, Marco Elver wrote: > On Wed, 2 Oct 2024 at 12:42, Vlastimil Babka <vbabka@xxxxxxx> wrote: >> >> On 9/11/24 08:45, Feng Tang wrote: >> > Danilo Krummrich's patch [1] raised one problem about krealloc() that >> > its caller doesn't pass the old request size, say the object is 64 >> > bytes kmalloc one, but caller originally only requested 48 bytes. Then >> > when krealloc() shrinks or grows in the same object, or allocate a new >> > bigger object, it lacks this 'original size' information to do accurate >> > data preserving or zeroing (when __GFP_ZERO is set). >> > >> > Thus with slub debug redzone and object tracking enabled, parts of the >> > object after krealloc() might contain redzone data instead of zeroes, >> > which is violating the __GFP_ZERO guarantees. Good thing is in this >> > case, kmalloc caches do have this 'orig_size' feature, which could be >> > used to improve the situation here. >> > >> > To make the 'orig_size' accurate, we adjust some kasan/slub meta data >> > handling. Also add a slub kunit test case for krealloc(). >> > >> > This patchset has dependency over patches in both -mm tree and -slab >> > trees, so it is written based on linux-next tree '20240910' version. >> > >> > [1]. https://lore.kernel.org/lkml/20240812223707.32049-1-dakr@xxxxxxxxxx/ >> >> Thanks, added to slab/for-next > > This series just hit -next, and we're seeing several "KFENCE: memory > corruption ...". Here's one: > https://lore.kernel.org/all/66ff8bf6.050a0220.49194.0453.GAE@xxxxxxxxxx/ > > One more (no link): > >> ================================================================== >> BUG: KFENCE: memory corruption in xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051 >> >> Corrupted memory at 0xffff88823bf5a0d0 [ 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 ] (in kfence-#172): >> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051 >> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062 >> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145 >> rcu_do_batch kernel/rcu/tree.c:2567 [inline] > [...] >> >> kfence-#172: 0xffff88823bf5a000-0xffff88823bf5a0cf, size=208, cache=kmalloc-256 >> >> allocated by task 5494 on cpu 0 at 101.266046s (0.409225s ago): >> __do_krealloc mm/slub.c:4784 [inline] >> krealloc_noprof+0xd6/0x2e0 mm/slub.c:4838 >> xfs_iext_realloc_root fs/xfs/libxfs/xfs_iext_tree.c:613 [inline] > [...] >> >> freed by task 16 on cpu 0 at 101.573936s (0.186416s ago): >> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051 >> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062 >> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145 > [...] >> >> CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc1-next-20241003-syzkaller #0 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 >> ================================================================== > > Unfortunately there's no reproducer yet it seems. Unless it's > immediately obvious to say what's wrong, is it possible to take this > series out of -next to confirm this series is causing the memory > corruptions? Syzbot should then stop finding these crashes. I think it's commit d0a38fad51cc7 doing in __do_krealloc() - ks = ksize(p); + + s = virt_to_cache(p); + orig_size = get_orig_size(s, (void *)p); + ks = s->object_size; so for kfence objects we don't get their actual allocation size but the potentially larger bucket size? I guess we could do: ks = kfence_ksize(p) ?: s->object_size; ? > Thanks, > -- Marco