Re: [syzbot] WARNING in hugetlb_change_protection

Hi, Mike,

Thanks for forwarding.

On Wed, Aug 03, 2022 at 10:02:37AM -0700, Mike Kravetz wrote:
> I'll start looking at this, but adding Peter this may be related to his
> recent changes.
> -- 
> Mike Kravetz
> 
> On 08/03/22 08:32, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    e65c6a46df94 Merge tag 'drm-fixes-2022-07-30' of git://ano..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=139cc282080000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=26034e6fe0075dad
> > dashboard link: https://syzkaller.appspot.com/bug?extid=824e71311e757a9689ff
> > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > userspace arch: i386
> > 
> > Unfortunately, I don't have any reproducer for this issue yet.
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+824e71311e757a9689ff@xxxxxxxxxxxxxxxxxxxxxxxxx
> > 
> > ------------[ cut here ]------------
> > WARNING: CPU: 1 PID: 28745 at include/linux/swapops.h:319 make_pte_marker_entry include/linux/swapops.h:319 [inline]

This is the warning I added to make sure a pte marker is never created
when pte markers are not configured at all:

static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
{
	/* This should never be called if !CONFIG_PTE_MARKER */
	WARN_ON_ONCE(1);
	return swp_entry(0, 0);
}

The stack below comes from a UFFDIO_WRITEPROTECT ioctl, but logically it
should never get there: with !PTE_MARKER we must also have
!PTE_MARKER_UFFD_WP, in which case vma_can_userfault() should have returned
false when the hugetlb vma tried to register with uffd-wp:

static inline bool vma_can_userfault(struct vm_area_struct *vma,
				     unsigned long vm_flags)
{
	if (vm_flags & VM_UFFD_MINOR)
		return is_vm_hugetlb_page(vma) || vma_is_shmem(vma);

#ifndef CONFIG_PTE_MARKER_UFFD_WP
	/*
	 * If user requested uffd-wp but not enabled pte markers for
	 * uffd-wp, then shmem & hugetlbfs are not supported but only
	 * anonymous.
	 */
	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
		return false;
#endif
	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
	    vma_is_shmem(vma);
}

So the UFFDIO_WRITEPROTECT should already have failed at:

	if (!userfaultfd_wp(dst_vma))
		goto out_unlock;

in mwriteprotect_range().
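
As a rough user-space model of the two checks above (not kernel code: the
flag values, the model_* names and the runtime marker_uffd_wp parameter,
which stands in for CONFIG_PTE_MARKER_UFFD_WP, are all assumptions made for
illustration), the expected behaviour could be sketched as:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model only; not the kernel's. */
#define MODEL_VM_UFFD_WP    0x1UL
#define MODEL_VM_UFFD_MINOR 0x2UL

enum model_vma_kind { MODEL_ANON, MODEL_HUGETLB, MODEL_SHMEM, MODEL_OTHER };

struct model_vma {
	enum model_vma_kind kind;
	unsigned long vm_flags;	/* uffd flags recorded at registration */
};

/* Same decision shape as vma_can_userfault() quoted above. */
bool model_vma_can_userfault(const struct model_vma *vma,
			     unsigned long vm_flags, bool marker_uffd_wp)
{
	bool file_like = vma->kind == MODEL_HUGETLB || vma->kind == MODEL_SHMEM;

	if (vm_flags & MODEL_VM_UFFD_MINOR)
		return file_like;

	/* Models the #ifndef CONFIG_PTE_MARKER_UFFD_WP block. */
	if (!marker_uffd_wp && (vm_flags & MODEL_VM_UFFD_WP) &&
	    vma->kind != MODEL_ANON)
		return false;

	return vma->kind == MODEL_ANON || file_like;
}

/* Registration records the flags only when the vma is eligible. */
bool model_register(struct model_vma *vma, unsigned long vm_flags,
		    bool marker_uffd_wp)
{
	if (!model_vma_can_userfault(vma, vm_flags, marker_uffd_wp))
		return false;
	vma->vm_flags |= vm_flags;
	return true;
}

/* Models the userfaultfd_wp(dst_vma) test in mwriteprotect_range(). */
bool model_writeprotect_allowed(const struct model_vma *vma)
{
	return (vma->vm_flags & MODEL_VM_UFFD_WP) != 0;
}

/* Usage: the scenario discussed in this mail. */
void model_demo(void)
{
	struct model_vma huge = { MODEL_HUGETLB, 0 };

	/* Markers compiled out: registration fails, so wr-protect is refused. */
	assert(!model_register(&huge, MODEL_VM_UFFD_WP, false));
	assert(!model_writeprotect_allowed(&huge));
}
```

In this model a hugetlb vma can only pass the wr-protect check if it
registered while the marker option was enabled, which is why the reported
stack is surprising.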

I still have no idea how the bot managed to trigger a real wr-protect on
this vma (I don't think it should have been able to register with uffd-wp,
but maybe the check can be bypassed somehow).  However, to make this safer
we can also guard the pte marker callers with CONFIG_PTE_MARKER_UFFD_WP.  A
patch for that is attached, but since there is no reproducer yet there may
be no easy way to verify it.
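
The guard pattern used in the attached patch, where the "else" sits inside
the #ifdef so that the clearing statement stays unconditional when the
option is off, can be sketched in user space as below.  This is only a
model: MODEL_PTE_MARKER_UFFD_WP stands in for CONFIG_PTE_MARKER_UFFD_WP,
and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for this model; the kernel code writes ptes instead. */
enum model_action { MODEL_CLEAR_PTE, MODEL_INSTALL_MARKER };

/*
 * MODEL_PTE_MARKER_UFFD_WP stands in for CONFIG_PTE_MARKER_UFFD_WP.
 * Build with -DMODEL_PTE_MARKER_UFFD_WP to compile the marker branch in.
 */
enum model_action model_zap_nonpresent(bool pte_is_uffd_wp, bool drop_marker)
{
	/* Keep both parameters "used" when the marker branch is compiled out. */
	(void)pte_is_uffd_wp;
	(void)drop_marker;

#ifdef MODEL_PTE_MARKER_UFFD_WP
	if (pte_is_uffd_wp && !drop_marker)
		return MODEL_INSTALL_MARKER;
	else
#endif
		return MODEL_CLEAR_PTE;
}

/* Usage: what callers see with and without the guard compiled in. */
void model_zap_demo(void)
{
	/* Callers that ask to drop markers always get a plain clear. */
	assert(model_zap_nonpresent(true, true) == MODEL_CLEAR_PTE);
#ifndef MODEL_PTE_MARKER_UFFD_WP
	/* Guard compiled out: no marker can ever be produced. */
	assert(model_zap_nonpresent(true, false) == MODEL_CLEAR_PTE);
#endif
}
```

With the macro undefined, the marker branch does not exist in the object
code at all, so the warning stub could not be reached from this path.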

In the meantime, I'd also like to double check the kernel config to make
sure CONFIG_PTE_MARKER_UFFD_WP is always unset when CONFIG_PTE_MARKER=n.
Does anyone know where I can fetch the config?
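
Once the .config text is at hand, the consistency check described above
could be sketched like this.  It is a minimal user-space helper, assuming
the config is already loaded into a NUL-terminated string with "\n" line
endings; the function names are hypothetical and not part of any kernel
tooling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if "<name>=y" appears as a whole line in cfg. */
bool config_enabled(const char *cfg, const char *name)
{
	size_t len = strlen(name);
	const char *p = cfg;

	if (len == 0)
		return false;

	while ((p = strstr(p, name)) != NULL) {
		bool at_line_start = (p == cfg) || (p[-1] == '\n');

		if (at_line_start && strncmp(p + len, "=y", 2) == 0 &&
		    (p[len + 2] == '\n' || p[len + 2] == '\0'))
			return true;
		p += len;
	}
	return false;
}

/* PTE_MARKER_UFFD_WP=y is only consistent together with PTE_MARKER=y. */
bool marker_config_consistent(const char *cfg)
{
	return !config_enabled(cfg, "CONFIG_PTE_MARKER_UFFD_WP") ||
	       config_enabled(cfg, "CONFIG_PTE_MARKER");
}

/* Usage: an inconsistent fragment is rejected, a consistent one accepted. */
void marker_config_demo(void)
{
	assert(!marker_config_consistent("CONFIG_PTE_MARKER_UFFD_WP=y\n"));
	assert(marker_config_consistent("CONFIG_PTE_MARKER=y\n"
					"CONFIG_PTE_MARKER_UFFD_WP=y\n"));
}
```

The "=y" suffix is matched right after the option name, so a line for
CONFIG_PTE_MARKER_UFFD_WP is not mistaken for CONFIG_PTE_MARKER.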

Thanks,

> > WARNING: CPU: 1 PID: 28745 at include/linux/swapops.h:319 make_pte_marker include/linux/swapops.h:342 [inline]
> > WARNING: CPU: 1 PID: 28745 at include/linux/swapops.h:319 hugetlb_change_protection+0xf85/0x1610 mm/hugetlb.c:6392
> > Modules linked in:
> > CPU: 1 PID: 28745 Comm: syz-executor.3 Not tainted 5.19.0-rc8-syzkaller-00146-ge65c6a46df94 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
> > RIP: 0010:make_pte_marker_entry include/linux/swapops.h:319 [inline]
> > RIP: 0010:make_pte_marker include/linux/swapops.h:342 [inline]
> > RIP: 0010:hugetlb_change_protection+0xf85/0x1610 mm/hugetlb.c:6392
> > Code: e8 d0 5a b7 ff 0f b6 94 24 80 00 00 00 48 8b 84 24 98 00 00 00 84 d2 0f 84 ef 02 00 00 49 89 c4 e9 48 fb ff ff e8 ab 5e b7 ff <0f> 0b 48 b9 00 00 00 00 00 fc ff df 48 89 d8 48 c1 e8 03 80 3c 08
> > RSP: 0018:ffffc90014cc7780 EFLAGS: 00010212
> > RAX: 000000000000082a RBX: ffff88807750e820 RCX: ffffc90006723000
> > RDX: 0000000000040000 RSI: ffffffff81c30c25 RDI: 0000000000000007
> > RBP: ffff888074de5ea0 R08: 0000000000000007 R09: 0000000000000000
> > R10: 0000000000000004 R11: 0000000000000001 R12: 0000000000000000
> > R13: 0000000000000000 R14: 0000000000000004 R15: ffff88801fcc8e00
> > FS:  0000000000000000(0000) GS:ffff8880b9b00000(0063) knlGS:00000000f7f06b40
> > CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
> > CR2: 0000000020000040 CR3: 000000001b84c000 CR4: 00000000003526e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <TASK>
> >  change_protection+0x96b/0x3ad0 mm/mprotect.c:463
> >  mwriteprotect_range+0x387/0x5c0 mm/userfaultfd.c:759
> >  userfaultfd_writeprotect fs/userfaultfd.c:1823 [inline]
> >  userfaultfd_ioctl+0x438/0x4340 fs/userfaultfd.c:1997
> >  compat_ptr_ioctl+0x67/0x90 fs/ioctl.c:906
> >  __do_compat_sys_ioctl+0x1c7/0x290 fs/ioctl.c:968
> >  do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
> >  __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
> >  do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203
> >  entry_SYSENTER_compat_after_hwframe+0x70/0x82
> > RIP: 0023:0xf7f0b549
> > Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
> > RSP: 002b:00000000f7f065cc EFLAGS: 00000296 ORIG_RAX: 0000000000000036
> > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000c018aa06
> > RDX: 00000000200000c0 RSI: 0000000000000000 RDI: 0000000000000000
> > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> >  </TASK>
> > ----------------
> > Code disassembly (best guess):
> >    0:	03 74 c0 01          	add    0x1(%rax,%rax,8),%esi
> >    4:	10 05 03 74 b8 01    	adc    %al,0x1b87403(%rip)        # 0x1b8740d
> >    a:	10 06                	adc    %al,(%rsi)
> >    c:	03 74 b4 01          	add    0x1(%rsp,%rsi,4),%esi
> >   10:	10 07                	adc    %al,(%rdi)
> >   12:	03 74 b0 01          	add    0x1(%rax,%rsi,4),%esi
> >   16:	10 08                	adc    %cl,(%rax)
> >   18:	03 74 d8 01          	add    0x1(%rax,%rbx,8),%esi
> >   1c:	00 00                	add    %al,(%rax)
> >   1e:	00 00                	add    %al,(%rax)
> >   20:	00 51 52             	add    %dl,0x52(%rcx)
> >   23:	55                   	push   %rbp
> >   24:	89 e5                	mov    %esp,%ebp
> >   26:	0f 34                	sysenter
> >   28:	cd 80                	int    $0x80
> > * 2a:	5d                   	pop    %rbp <-- trapping instruction
> >   2b:	5a                   	pop    %rdx
> >   2c:	59                   	pop    %rcx
> >   2d:	c3                   	retq
> >   2e:	90                   	nop
> >   2f:	90                   	nop
> >   30:	90                   	nop
> >   31:	90                   	nop
> >   32:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
> >   39:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
> > 
> > 
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.
> > 
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 

-- 
Peter Xu
From 1be141a6accbb07e1c9bff665f3c5e147beea70f Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@xxxxxxxxxx>
Date: Wed, 3 Aug 2022 16:40:10 -0400
Subject: [PATCH] mm/uffd: Guard pte marker callers with PTE_MARKER_UFFD_WP
Content-type: text/plain

Logically no !PTE_MARKER user should be able to trigger make_pte_marker()
in any path.  However, as an extra guard, put all pte marker callers under
CONFIG_PTE_MARKER_UFFD_WP so that they are not compiled in when the option
is not configured.

Reported-by: syzbot+824e71311e757a9689ff@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---
 mm/hugetlb.c  | 6 ++++++
 mm/mprotect.c | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a18c071c294e..e632cdf1e3f4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5049,6 +5049,7 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 		 * unmapped and its refcount is dropped, so just clear pte here.
 		 */
 		if (unlikely(!pte_present(pte))) {
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
 			/*
 			 * If the pte was wr-protected by uffd-wp in any of the
 			 * swap forms, meanwhile the caller does not want to
@@ -5060,6 +5061,7 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 				set_huge_pte_at(mm, address, ptep,
 						make_pte_marker(PTE_MARKER_UFFD_WP));
 			else
+#endif
 				huge_pte_clear(mm, address, ptep, sz);
 			spin_unlock(ptl);
 			continue;
@@ -5088,11 +5090,13 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 		tlb_remove_huge_tlb_entry(h, tlb, ptep, address);
 		if (huge_pte_dirty(pte))
 			set_page_dirty(page);
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
 		/* Leave a uffd-wp pte marker if needed */
 		if (huge_pte_uffd_wp(pte) &&
 		    !(zap_flags & ZAP_FLAG_DROP_MARKER))
 			set_huge_pte_at(mm, address, ptep,
 					make_pte_marker(PTE_MARKER_UFFD_WP));
+#endif
 		hugetlb_count_sub(pages_per_huge_page(h), mm);
 		page_remove_rmap(page, vma, true);
 
@@ -6387,10 +6391,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 			pages++;
 		} else {
 			/* None pte */
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
 			if (unlikely(uffd_wp))
 				/* Safe to modify directly (none->non-present). */
 				set_huge_pte_at(mm, address, ptep,
 						make_pte_marker(PTE_MARKER_UFFD_WP));
+#endif
 		}
 		spin_unlock(ptl);
 	}
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ba5592655ee3..85ef55a74d6e 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -221,6 +221,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 		} else {
 			/* It must be an none page, or what else?.. */
 			WARN_ON_ONCE(!pte_none(oldpte));
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
 			if (unlikely(uffd_wp && !vma_is_anonymous(vma))) {
 				/*
 				 * For file-backed mem, we need to be able to
@@ -232,6 +233,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 					   make_pte_marker(PTE_MARKER_UFFD_WP));
 				pages++;
 			}
+#endif
 		}
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
-- 
2.32.0

