On Wed, 28 Jul 2021 at 13:05, Kuan-Ying Lee <Kuan-Ying.Lee@xxxxxxxxxxxx> wrote:
>
> On Tue, 2021-07-27 at 20:22 +0100, Catalin Marinas wrote:
> > On Tue, Jul 27, 2021 at 04:32:02PM +0800, Kuan-Ying Lee wrote:
> > > On Tue, 2021-07-27 at 09:10 +0200, Marco Elver wrote:
> > > > +Cc Catalin
> > > >
> > > > On Tue, 27 Jul 2021 at 06:00, Kuan-Ying Lee <Kuan-Ying.Lee@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > Hardware tag-based KASAN doesn't use compiler instrumentation, we
> > > > > can not use kasan_disable_current() to ignore tag check.
> > > > >
> > > > > Thus, we need to reset tags when accessing metadata.
> > > > >
> > > > > Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@xxxxxxxxxxxx>
> > > >
> > > > This looks reasonable, but the patch title is not saying this is
> > > > kmemleak, nor does the description say what the problem is. What
> > > > problem did you encounter? Was it a false positive?
> > >
> > > kmemleak would scan kernel memory to check memory leak.
> > > When it scans on the invalid slab and dereference, the issue
> > > will occur like below.
> > >
> > > So I think we should reset the tag before scanning.
> > >
> > > # echo scan > /sys/kernel/debug/kmemleak
> > > [ 151.905804] ==================================================================
> > > [ 151.907120] BUG: KASAN: out-of-bounds in scan_block+0x58/0x170
> > > [ 151.908773] Read at addr f7ff0000c0074eb0 by task kmemleak/138
> > > [ 151.909656] Pointer tag: [f7], memory tag: [fe]
> >
> > It would be interesting to find out why the tag doesn't match. Kmemleak
> > should in principle only scan valid objects that have been allocated and
> > the pointer can be safely dereferenced. 0xfe is KASAN_TAG_INVALID, so it
> > either goes past the size of the object (into the red zone) or it still
> > accesses the object after it was marked as freed but before being
> > released from kmemleak.
> >
> > With slab, looking at __cache_free(), it calls kasan_slab_free() before
> > ___cache_free() -> kmemleak_free_recursive(), so the second scenario is
> > possible. With slub, however, slab_free_hook() first releases the object
> > from kmemleak before poisoning it. Based on the stack dump, you are
> > using slub, so it may be that kmemleak goes into the object red
> > zones.
> >
> > I'd like this clarified before blindly resetting the tag.
>
> This kasan issue only happened on hardware tag-based kasan mode.
> Because kasan_disable_current() works for generic and sw tag-based
> kasan.
>
> HW tag-based kasan depends on slub so slab will not hit this
> issue.
> I think we can just check if HW tag-based kasan is enabled or not
> and decide to reset the tag as below.
>
> if (kasan_has_integrated_init()) // slub case, hw-tag kasan
>         pointer = *(unsigned long *)kasan_reset_tag((void *)ptr);
> else
>         pointer = *ptr; // slab

This is redundant. kasan_reset_tag() is a noop if
!IS_ENABLED(CONFIG_KASAN_HW_TAGS).

> Is this better or any other suggestions?
> Any suggestion is appreciated.

The current version is fine. But I think Catalin's point about why
kmemleak accesses the data in the first place still deserves some
investigation.

Could it be a race between free and kmemleak scan?