* Carlos Llamas <cmllamas@xxxxxxxxxx> [231102 15:00]:
> The mmap read lock is used during the shrinker's callback, which means
> that using alloc->vma pointer isn't safe as it can race with munmap().

I think you know my feelings about the safety of that pointer from
previous discussions.

> As of commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
> munmap") the mmap lock is downgraded after the vma has been isolated.
>
> I was able to reproduce this issue by manually adding some delays and
> triggering page reclaiming through the shrinker's debug sysfs. The
> following KASAN report confirms the UAF:
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in zap_page_range_single+0x470/0x4b8
> Read of size 8 at addr ffff356ed50e50f0 by task bash/478
>
> CPU: 1 PID: 478 Comm: bash Not tainted 6.6.0-rc5-00055-g1c8b86a3799f-dirty #70
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  zap_page_range_single+0x470/0x4b8
>  binder_alloc_free_page+0x608/0xadc
>  __list_lru_walk_one+0x130/0x3b0
>  list_lru_walk_node+0xc4/0x22c
>  binder_shrink_scan+0x108/0x1dc
>  shrinker_debugfs_scan_write+0x2b4/0x500
>  full_proxy_write+0xd4/0x140
>  vfs_write+0x1ac/0x758
>  ksys_write+0xf0/0x1dc
>  __arm64_sys_write+0x6c/0x9c
>
> Allocated by task 492:
>  kmem_cache_alloc+0x130/0x368
>  vm_area_alloc+0x2c/0x190
>  mmap_region+0x258/0x18bc
>  do_mmap+0x694/0xa60
>  vm_mmap_pgoff+0x170/0x29c
>  ksys_mmap_pgoff+0x290/0x3a0
>  __arm64_sys_mmap+0xcc/0x144
>
> Freed by task 491:
>  kmem_cache_free+0x17c/0x3c8
>  vm_area_free_rcu_cb+0x74/0x98
>  rcu_core+0xa38/0x26d4
>  rcu_core_si+0x10/0x1c
>  __do_softirq+0x2fc/0xd24
>
> Last potentially related work creation:
>  __call_rcu_common.constprop.0+0x6c/0xba0
>  call_rcu+0x10/0x1c
>  vm_area_free+0x18/0x24
>  remove_vma+0xe4/0x118
>  do_vmi_align_munmap.isra.0+0x718/0xb5c
>  do_vmi_munmap+0xdc/0x1fc
>  __vm_munmap+0x10c/0x278
>  __arm64_sys_munmap+0x58/0x7c
>
> Fix this issue by performing instead a vma_lookup()
> which will fail to
> find the vma that was isolated before the mmap lock downgrade. Note that
> this option has better performance than upgrading to a mmap write lock
> which would increase contention. Plus, mmap_write_trylock() has been
> recently removed anyway.
>
> Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Liam Howlett <liam.howlett@xxxxxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Signed-off-by: Carlos Llamas <cmllamas@xxxxxxxxxx>
> ---
>  drivers/android/binder_alloc.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index e3db8297095a..c4d60d81221b 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -1005,7 +1005,9 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
>  		goto err_mmget;
>  	if (!mmap_read_trylock(mm))
>  		goto err_mmap_read_lock_failed;
> -	vma = binder_alloc_get_vma(alloc);
> +	vma = vma_lookup(mm, page_addr);
> +	if (vma && vma != binder_alloc_get_vma(alloc))
> +		goto err_invalid_vma;

Doesn't this need to be:

	if (!vma || vma != binder_alloc_get_vma(alloc))

This way, we catch a different vma and a NULL vma.

Or even, just:

	if (vma != binder_alloc_get_vma(alloc))

if the alloc vma cannot be NULL?

>
>  	list_lru_isolate(lru, item);
>  	spin_unlock(lock);
> @@ -1031,6 +1033,8 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
>  	mutex_unlock(&alloc->mutex);
>  	return LRU_REMOVED_RETRY;
>
> +err_invalid_vma:
> +	mmap_read_unlock(mm);
>  err_mmap_read_lock_failed:
>  	mmput_async(mm);
>  err_mmget:
> --
> 2.42.0.869.gea05f2083d-goog
>
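To make the difference between the two checks concrete, here is a userspace sketch of the conditions only (the struct and helper names are hypothetical stand-ins, not the kernel types). The patch's `vma && vma != ...` condition falls through when vma_lookup() returns NULL (i.e. the page's vma is already gone), whereas the `!vma || vma != ...` variant bails out in that case as well; whether falling through on NULL is actually safe depends on what the code after the check does with a NULL vma.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct. */
struct vma { int dummy; };

/*
 * The check as written in the patch: bails out only when vma_lookup()
 * returned a vma that is NOT the binder vma. A NULL lookup result
 * falls through.
 */
static int patch_check_bails(const struct vma *looked_up,
			     const struct vma *alloc_vma)
{
	return looked_up && looked_up != alloc_vma;
}

/*
 * The check suggested in the review: bails out on a different vma
 * AND on a NULL vma.
 */
static int suggested_check_bails(const struct vma *looked_up,
				 const struct vma *alloc_vma)
{
	return !looked_up || looked_up != alloc_vma;
}
```

The only input on which the two disagree is `looked_up == NULL`, which is exactly the case the review comment is asking about.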