On Thu, Apr 25, 2024 at 1:01 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> On Thu, Apr 25, 2024 at 08:39:37AM -0700, Suren Baghdasaryan wrote:
> > On Wed, Apr 24, 2024 at 8:26 PM Kent Overstreet
> > <kent.overstreet@xxxxxxxxx> wrote:
> > >
> > > On Wed, Apr 24, 2024 at 06:59:01PM -0700, Kees Cook wrote:
> > > > On Thu, Mar 21, 2024 at 09:36:22AM -0700, Suren Baghdasaryan wrote:
> > > > > Low overhead [1] per-callsite memory allocation profiling. Not just for
> > > > > debug kernels, overhead low enough to be deployed in production.
> > > >
> > > > Okay, I think I'm holding it wrong. With next-20240424 if I set:
> > > >
> > > > CONFIG_CODE_TAGGING=y
> > > > CONFIG_MEM_ALLOC_PROFILING=y
> > > > CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y
> > > >
> > > > My test system totally freaks out:
> > > >
> > > > ...
> > > > SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> > > > Oops: general protection fault, probably for non-canonical address 0xc388d881e4808550: 0000 [#1] PREEMPT SMP NOPTI
> > > > CPU: 0 PID: 0 Comm: swapper Not tainted 6.9.0-rc5-next-20240424 #1
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> > > > RIP: 0010:__kmalloc_node_noprof+0xcd/0x560
> > > >
> > > > Which is:
> > > >
> > > > __kmalloc_node_noprof+0xcd/0x560:
> > > > __slab_alloc_node at mm/slub.c:3780 (discriminator 2)
> > > > (inlined by) slab_alloc_node at mm/slub.c:3982 (discriminator 2)
> > > > (inlined by) __do_kmalloc_node at mm/slub.c:4114 (discriminator 2)
> > > > (inlined by) __kmalloc_node_noprof at mm/slub.c:4122 (discriminator 2)
> > > >
> > > > Which is:
> > > >
> > > > tid = READ_ONCE(c->tid);
> > > >
> > > > I haven't gotten any further than that; I'm EOD. Anyone seen anything
> > > > like this with this series?
> > >
> > > I certainly haven't. That looks like some real corruption, we're in slub
> > > internal data structures and derefing a garbage address. Check kasan and
> > > all that?
> > >
> > Hi Kees,
> > I tested next-20240424 yesterday with defconfig and
> > CONFIG_MEM_ALLOC_PROFILING enabled but didn't see any issue like that.
> > Could you share your config file please?
>
> Well *that* took a while to .config bisect. I probably should have found
> it sooner, but CONFIG_DEBUG_KMEMLEAK=y is what broke me. Without that,
> everything is lovely! :)
>
> I can reproduce it now with:
>
> $ make defconfig kvm_guest.config
> $ ./scripts/config -e CONFIG_MEM_ALLOC_PROFILING -e CONFIG_DEBUG_KMEMLEAK

Thanks! I'll use this to reproduce the issue and will see if we can
handle that recursion in a better way.

>
> -Kees
>
> --
> Kees Cook