On Mon, Sep 19, 2022 at 02:03:15PM +0200, Vlastimil Babka wrote: > On 9/19/22 13:56, Hyeonggon Yoo wrote: > > On Mon, Sep 19, 2022 at 11:12:38AM +0200, Vlastimil Babka wrote: > >> On 9/19/22 05:12, Feng Tang wrote: > >> > When doing slub_debug test, kfence's 'test_memcache_typesafe_by_rcu' > >> > kunit test case cause a use-after-free error: > >> > > > > > If I'm not mistaken, I think the subject should be: > > s/double free/use after free/g > > Well, it's both AFAICS. By the initial use-after-free we can read a wrong > s->flags that was modified since we freed for the first time, and it can > lead to another kmem_cache_release() which is basically a double free. > Yeah, I realized that just after sending the mail ;) it is use-after-free bug that can potentially lead to double free. Thank you for correction! > >> > BUG: KASAN: use-after-free in kobject_del+0x14/0x30 > >> > Read of size 8 at addr ffff888007679090 by task kunit_try_catch/261 > >> > > >> > CPU: 1 PID: 261 Comm: kunit_try_catch Tainted: G B N 6.0.0-rc5-next-20220916 #17 > >> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 > >> > Call Trace: > >> > <TASK> > >> > dump_stack_lvl+0x34/0x48 > >> > print_address_description.constprop.0+0x87/0x2a5 > >> > print_report+0x103/0x1ed > >> > kasan_report+0xb7/0x140 > >> > kobject_del+0x14/0x30 > >> > kmem_cache_destroy+0x130/0x170 > >> > test_exit+0x1a/0x30 > >> > kunit_try_run_case+0xad/0xc0 > >> > kunit_generic_run_threadfn_adapter+0x26/0x50 > >> > kthread+0x17b/0x1b0 > >> > </TASK> > >> > > >> > The cause is inside kmem_cache_destroy(): > >> > > >> > kmem_cache_destroy > >> > acquire lock/mutex > >> > shutdown_cache > >> > schedule_work(kmem_cache_release) (if RCU flag set) > >> > release lock/mutex > >> > kmem_cache_release (if RCU flag set) > >> > >> ^ not set > >> > >> I've fixed that up. > >> > >> > > >> > in some certain timing, the scheduled work could be run before > >> > the next RCU flag checking which will get a wrong state. > >> > > >> > Fix it by caching the RCU flag inside protected area, just like 'refcnt' > > > > Very nice catch, thanks! > > > > Otherwise (and with Vlastimil's fix): > > > > Looks good to me. > > Reviewed-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> > > > >> > > >> > Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx> > >> > >> Thanks! > >> > >> > --- > >> > > >> > note: > >> > > >> > The error only happens on linux-next tree, and not in Linus' tree, > >> > which already has Waiman's commit: > >> > 0495e337b703 ("mm/slab_common: Deleting kobject in kmem_cache_destroy() > >> > without holding slab_mutex/cpu_hotplug_lock") > >> > >> Actually that commit is already in Linus' rc5 too, so I will send your fix > >> this week too. Added a Fixes: 0495e337b703 (...) too. > >> > >> > mm/slab_common.c | 5 ++++- > >> > 1 file changed, 4 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/mm/slab_common.c b/mm/slab_common.c > >> > index 07b948288f84..ccc02573588f 100644 > >> > --- a/mm/slab_common.c > >> > +++ b/mm/slab_common.c > >> > @@ -475,6 +475,7 @@ void slab_kmem_cache_release(struct kmem_cache *s) > >> > void kmem_cache_destroy(struct kmem_cache *s) > >> > { > >> > int refcnt; > >> > + bool rcu_set; > >> > > >> > if (unlikely(!s) || !kasan_check_byte(s)) > >> > return; > >> > @@ -482,6 +483,8 @@ void kmem_cache_destroy(struct kmem_cache *s) > >> > cpus_read_lock(); > >> > mutex_lock(&slab_mutex); > >> > > >> > + rcu_set = s->flags & SLAB_TYPESAFE_BY_RCU; > >> > + > >> > refcnt = --s->refcount; > >> > if (refcnt) > >> > goto out_unlock; > >> > @@ -492,7 +495,7 @@ void kmem_cache_destroy(struct kmem_cache *s) > >> > out_unlock: > >> > mutex_unlock(&slab_mutex); > >> > cpus_read_unlock(); > >> > - if (!refcnt && !(s->flags & SLAB_TYPESAFE_BY_RCU)) > >> > + if (!refcnt && !rcu_set) > >> > kmem_cache_release(s); > >> > } > >> > EXPORT_SYMBOL(kmem_cache_destroy); > >> > > > -- Thanks, Hyeonggon