On 11.07.24 07:13, Pei Li wrote:
This patch fixes this warning by acquiring read lock before entering
untrack_pfn() while write lock is not held.
syzbot has tested the proposed patch and the reproducer did not
trigger any issue.
Reported-by: syzbot+35a4414f6e247f515443@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
Tested-by: syzbot+35a4414f6e247f515443@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Pei Li <peili.dev@xxxxxxxxx>
---
Syzbot reported the following warning in follow_pte():
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 rwsem_assert_held include/linux/rwsem.h:195 [inline]
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 mmap_assert_locked include/linux/mmap_lock.h:65 [inline]
WARNING: CPU: 3 PID: 5192 at include/linux/rwsem.h:195 follow_pte+0x414/0x4c0 mm/memory.c:5980
This is because follow_pte() assumes that mm->mmap_lock is held on
entry. That assertion was added in commit c5541ba378e3 ("mm:
follow_pte() improvements").
However, in the following call stack, we are not acquiring the lock:
follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
zap_page_range_single+0x326/0x560 mm/memory.c:1920
That implies that unmap_vmas() is called without the mmap lock in read
mode, correct?
Do we know how this happens?
* exit_mmap() holds the mmap lock in read mode
* unmap_region is documented to hold the mmap lock in read mode
In zap_page_range_single(), we passed mm_wr_locked as false, as we do
not expect write lock to be held.
In the special case where vma->vm_flags has VM_PFNMAP set, we hit
untrack_pfn(), which eventually calls into follow_phys().
This patch fixes this warning by acquiring read lock before entering
untrack_pfn() while write lock is not held.
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Tested on:
commit: 9d9a2f29 Merge tag 'mm-hotfixes-stable-2024-07-10-13-1..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13be8021980000
kernel config: https://syzkaller.appspot.com/x/.config?x=3456bae478301dc8
dashboard link: https://syzkaller.appspot.com/bug?extid=35a4414f6e247f515443
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=145e3441980000
Note: testing is done by a robot and is best-effort only.
---
mm/memory.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index d10e616d7389..75d7959b835b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1815,9 +1815,16 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 	if (vma->vm_file)
 		uprobe_munmap(vma, start, end);
-	if (unlikely(vma->vm_flags & VM_PFNMAP))
+	if (unlikely(vma->vm_flags & VM_PFNMAP)) {
+		if (!mm_wr_locked)
+			mmap_read_lock(vma->vm_mm);
+
 		untrack_pfn(vma, 0, 0, mm_wr_locked);
+		if (!mm_wr_locked)
+			mmap_read_unlock(vma->vm_mm);
+	}
+
 	if (start != end) {
 		if (unlikely(is_vm_hugetlb_page(vma))) {
I'm not sure this is the right fix. I'd like to understand how we end
up in that path without the mmap lock held, at least in read mode.
--
Cheers,
David / dhildenb