On Thu, May 11, 2023 at 11:08:41AM -0700, Mike Rapoport wrote: > (adding Peter) > > On Wed, May 10, 2023 at 09:26:10AM -0700, Lorenzo Stoakes wrote: > > On Wed, May 10, 2023 at 05:17:49PM +0100, Mark Rutland wrote: > > > On Wed, May 10, 2023 at 09:04:44AM -0700, Lorenzo Stoakes wrote: > > > > On Wed, May 10, 2023 at 03:15:51PM +0100, Mark Rutland wrote: > > > > > Hi, > > > > > > > > > > On Sun, Apr 30, 2023 at 09:19:17PM +0100, Lorenzo Stoakes wrote: > > > > > > We may still have inconsistent input parameters even if we choose not to > > > > > > merge and the vma_merge() invariant checks are useful for checking this > > > > > > with no production runtime cost (these are only relevant when > > > > > > CONFIG_DEBUG_VM is specified). > > > > > > > > > > > > Therefore, perform these checks regardless of whether we merge. > > > > > > > > > > > > This is relevant, as a recent issue (addressed in commit "mm/mempolicy: > > > > > > Correctly update prev when policy is equal on mbind") in the mbind logic > > > > > > was only picked up in the 6.2.y stable branch where these assertions are > > > > > > performed prior to determining mergeability. > > > > > > > > > > > > Had this remained the same in mainline this issue may have been picked up > > > > > > faster, so moving forward let's always check them. > > > > > > > > > > > > Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx> > > > > > > --- > > > > > > mm/mmap.c | 10 +++++----- > > > > > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > > > > > > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > > > > > index 5522130ae606..13678edaa22c 100644 > > > > > > --- a/mm/mmap.c > > > > > > +++ b/mm/mmap.c > > > > > > @@ -960,17 +960,17 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, > > > > > > merge_next = true; > > > > > > } > > > > > > > > > > > > + /* Verify some invariant that must be enforced by the caller. */ > > > > > > + VM_WARN_ON(prev && addr <= prev->vm_start); > > > > > > + VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end)); > > > > > > + VM_WARN_ON(addr >= end); > > > > > > + > > > > > > > > > > I'm seeing this fire a lot when fuzzing v6.4-rc1 on arm64 using Syzkaller. > > > > > > > > > > > > > Thanks, from the line I suspect addr != curr->vm_start, but need to look > > > > into the repro, at lsf/mm so a bit time lagged :) > > > > > > No problem; FWIW I can confirm your theory, the reproducer is causing: > > > > > > addr > curr->vm_start > > > > > > ... confirmed the the following hack, log below. > > > > Awesome thanks for that! Just been firing up qemu to do this. > > > > Cases 5-8 should really have addr == curr->vm_start, I wonder if it's > > another case but curr is being set incorrectly, it should in theory not be > > the case. > > AFAIU, it's a case of "adjust vma, but don't merge, because prev is not > compatible". Looks like uffd first attempts to merge compatible the newly > registered range with adjacent vmas relying on that there won't be no merge > when addr != curr->vm_start and only after the merge attempt it splits the > edges. > > I think that moving the split in fs/userfaultfd.c:1495 (as of v6.4-rc1) > before vma_merge() will be the right fix. > Ack this was my strong suspicion, just want to get back to the UK and de-lagged before I dig in properly. > > (See [1] for a visualisation of merge cases as a handy reference) > > > > Of course userfaultfd might be the offender here and might be relying on no > > merge case arising but passing dodgy parameters. > > > > [1]:https://ljs.io/merge_cases.png > > You really should put it into Documentation/mm ;-) Well... will reply to your lsf/mm docs thread on this topic ;) > > > > > > > | diff --git a/mm/mmap.c b/mm/mmap.c > > > | index 13678edaa22c..2cdebba15719 100644 > > > | --- a/mm/mmap.c > > > | +++ b/mm/mmap.c > > > | @@ -961,9 +961,21 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, > > > | } > > > | > > > | /* Verify some invariant that must be enforced by the caller. */ > > > | - VM_WARN_ON(prev && addr <= prev->vm_start); > > > | - VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end)); > > > | - VM_WARN_ON(addr >= end); > > > | + VM_WARN(prev && addr <= prev->vm_start, > > > | + "addr = 0x%016lx, prev->vm_start = 0x%016lx\n", > > > | + addr, prev->vm_start); > > > | + > > > | + VM_WARN(curr && addr != curr->vm_start, > > > | + "addr = 0x%016lx, curr->vm_start = 0x%016lx\n", > > > | + addr, curr->vm_start); > > > | + > > > | + VM_WARN(curr && addr > curr->vm_end, > > > | + "addr = 0x%016lx, curr->vm_end = 0x%016lx\n", > > > | + addr, curr->vm_end); > > > | + > > > | + VM_WARN(addr >= end, > > > | + "addr = 0x%016lx, end = 0x%016lx\n", > > > | + addr, end); > > > | > > > | if (!merge_prev && !merge_next) > > > | return NULL; /* Not mergeable. */ > > > > > > ... with that applied, running the reproducer results in: > > > > > > | addr = 0x0000ffff99dc2000, curr->vm_start = 0x0000ffff99db2000 > > > | WARNING: CPU: 0 PID: 163 at mm/mmap.c:968 vma_merge+0x3d4/0x1260 > > > > > > ... i.e. addr > curr->vm_start > > > > > > Thanks, > > > Mark. > >