On Fri, Jul 12, 2024 at 4:08 AM mawupeng <mawupeng1@xxxxxxxxxx> wrote:
>
> Hi maintainers,
>
> Kindly ping.
>
> On 2024/6/18 14:40, Wupeng Ma wrote:
> > Hi maintainers,
> >
> > During our testing, we discovered that kasan vmalloc may trigger a false
> > vmalloc-out-of-bounds warning due to a race between kasan_populate_vmalloc_pte
> > and kasan_depopulate_vmalloc_pte.
> >
> > cpu0                                cpu1                                cpu2
> > kasan_populate_vmalloc_pte          kasan_populate_vmalloc_pte          kasan_depopulate_vmalloc_pte
> >                                                                         spin_unlock(&init_mm.page_table_lock);
> > pte_none(ptep_get(ptep))
> > // pte is valid here, return here
> >                                                                         pte_clear(&init_mm, addr, ptep);
> >                                     pte_none(ptep_get(ptep))
> >                                     // pte is none here try alloc new pages
> >                                                                         spin_lock(&init_mm.page_table_lock);
> > kasan_poison
> > // memset kasan shadow region to 0
> >                                     page = __get_free_page(GFP_KERNEL);
> >                                     __memset((void *)page, KASAN_VMALLOC_INVALID, PAGE_SIZE);
> >                                     pte = pfn_pte(PFN_DOWN(__pa(page)), PAGE_KERNEL);
> >                                     spin_lock(&init_mm.page_table_lock);
> >                                     set_pte_at(&init_mm, addr, ptep, pte);
> >                                     spin_unlock(&init_mm.page_table_lock);
> >
> > After the race with cpu1, the kasan shadow memory that cpu0 relies on is
> > set to 0xf0, which means it is not initialized. Consequently, a false
> > vmalloc-out-of-bounds warning is triggered when a user attempts to access
> > this memory region.
> >
> > The root cause of this problem is that the pte valid check at the start of
> > kasan_populate_vmalloc_pte is not protected by page_table_lock, so it should
> > be removed. However, removing it may result in severe performance
> > degradation, since pages will be frequently allocated and freed.
> >
> > Are there any thoughts on how to solve this issue?
> >
> > Thank you.

I am going to take a closer look at this issue. Any chance you have a
reproducer for it?
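
To make the reported ordering easier to discuss, here is a minimal userspace
sketch that replays it deterministically. It is not kernel code and not a
reproducer. Every name in it (struct model_pte, MODEL_PAGE_SIZE,
MODEL_INVALID, the replay_* helpers) is hypothetical, and the poison byte is
only a placeholder using the 0xf0 value quoted in the report. It also assumes
that cpu0's shadow write lands on the page that was mapped when cpu0 did its
unlocked check, which the report does not spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16    /* hypothetical tiny "shadow page" */
#define MODEL_INVALID   0xF0  /* placeholder poison byte, value quoted in the report */

/* One shadow PTE: NULL models pte_none(), non-NULL is the backing page. */
struct model_pte {
	uint8_t *page;
};

/* Replays the reported order; returns the shadow byte a later access reads. */
static uint8_t replay_reported_interleaving(void)
{
	static uint8_t old_page[MODEL_PAGE_SIZE];
	static uint8_t new_page[MODEL_PAGE_SIZE];
	struct model_pte pte;

	/* Shadow is still mapped from an earlier user of the range. */
	memset(old_page, MODEL_INVALID, sizeof(old_page));
	pte.page = old_page;

	/* cpu0: unlocked check sees a valid pte and returns early. */
	uint8_t *cpu0_view = pte.page;
	assert(cpu0_view != NULL);

	/* cpu2: depopulate clears the pte. */
	pte.page = NULL;

	/* cpu1: unlocked check sees pte_none() and decides to allocate. */
	int cpu1_must_alloc = (pte.page == NULL);

	/* cpu0: zeroes the shadow. ASSUMPTION: the write hits the page cpu0 saw. */
	memset(cpu0_view, 0, MODEL_PAGE_SIZE);

	/* cpu1: fills a fresh page with the poison byte and installs it. */
	if (cpu1_must_alloc) {
		memset(new_page, MODEL_INVALID, sizeof(new_page));
		pte.page = new_page;
	}

	/* cpu0's zeroes are on old_page; the mapped page is still poisoned. */
	return pte.page[0];
}

/* Same actors, but depopulate finishes before either populate starts. */
static uint8_t replay_serialized(void)
{
	static uint8_t old_page[MODEL_PAGE_SIZE];
	static uint8_t new_page[MODEL_PAGE_SIZE];
	struct model_pte pte;

	memset(old_page, MODEL_INVALID, sizeof(old_page));
	pte.page = old_page;

	/* cpu2: depopulate runs to completion. */
	pte.page = NULL;

	/* cpu0: sees pte_none(), installs a poisoned page. */
	if (pte.page == NULL) {
		memset(new_page, MODEL_INVALID, sizeof(new_page));
		pte.page = new_page;
	}

	/* cpu1: sees a valid pte and returns early (nothing to do). */

	/* cpu0: zeroes the shadow through the current mapping. */
	memset(pte.page, 0, MODEL_PAGE_SIZE);

	return pte.page[0];
}
```

In this model, replay_reported_interleaving() ends with the poison byte still
visible through the mapping, while replay_serialized() ends with zeroed
shadow. That matches the symptom described above, but only under the stated
assumptions.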