* Peter Xu <peterx@xxxxxxxxxx> [230517 09:50]:

...

> > > >
> > > > I don't think this is safe.  You are telling vma_merge() something that
> > > > is not true and will result in can_vma_merge_before() passing.  I mean,
> > > > sure it will become true after you split (unless you can't?), but I
> > > > don't know if you can just merge a VMA that doesn't pass
> > > > can_vma_merge_before(), even for a short period?
> > >
> > > I must admit I'm not really that handy yet to vma codes, so I could miss
> > > something obvious.
> > >
> > > My reasoning comes from two parts that this pgoff looks all fine:
> > >
> > > 1) It's documented in vma_merge() in that:
> > >
> > >  * Given a mapping request (addr,end,vm_flags,file,pgoff,anon_name),
> > >  * figure out ...
> > >
> > > So fundamentally this pgoff is part of the mapping request paired with
> > > all the rest of the information.  AFAICT it means it must match with what
> > > "addr" is describing in VA address space.  That's why I think offsetting
> > > it makes sense here.
> > >
> > > It also matches my understanding in vma_merge() code on how the pgoff is
> > > used.
> > >
> > > 2) Uffd is nothing special in this regard, namely:
> > >
> > > mbind_range():
> > >
> > > pgoff = vma->vm_pgoff + ((vmstart - vma->vm_start) >> PAGE_SHIFT);
> > > merged = vma_merge(vmi, vma->vm_mm, *prev, vmstart, vmend, vma->vm_flags,
> > >                    vma->anon_vma, vma->vm_file, pgoff, new_pol,
> > >                    vma->vm_userfaultfd_ctx, anon_vma_name(vma));
> > >
> > > mlock_fixup():
> > >
> > > pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
> > > *prev = vma_merge(vmi, mm, *prev, start, end, newflags,
> > >                   vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
> > >                   vma->vm_userfaultfd_ctx, anon_vma_name(vma));
> > >
> > > mprotect_fixup():
> > >
> > > pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
> > > *pprev = vma_merge(vmi, mm, *pprev, start, end, newflags,
> > >                    vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
> > >                    vma->vm_userfaultfd_ctx, anon_vma_name(vma));
> > >
> > > I had a feeling that it's just something overlooked in the initial proposal
> > > of uffd, but maybe I missed something important?
> >
> > I think you are correct.  It's worth noting that all of these skip
> > splitting if merging succeeds.
>
> Yes, IIUC that's what we want because vma_merge() just handles everything
> there (including split, or say, vma range adjustments) if any !NULL
> returned.

I don't get your use of split here.  __vma_adjust() used to be used by
split, but it never split a VMA.  vma_merge() is not used by split at
all.

>
> >
> > We know it won't match case 1-4 (we have a current vma).  We then pass
> > in vma_end = min(end, vma->vm_end);
>
> Case 4 seems still possible and should be the case that mentioned in the
> patch 2, iiuc.  But yes I think vma_end calculation is needed, afaik it is
> to cover the last iteration, where that's the only place possible that we
> may operate on "end" (where < vma->vm_end) rather than "vma->vm_end".  It
> actually pairs with the initial "start" adjustment to me.
>
> >
> > vma_lookup() will only be called if end == vma->vm_end, so next will not
> > be set (and found) unless it is adjacent to the current vma and the vma
> > in question does not need to be split anyways.
> >
> > I also see that we use pgoff+pglen in the check, which avoids my concern
> > above.
>
> Right.
>
> It seems so far all concerns are more or less ruled out.  I'll prepare a
> formal patchset, we can continue the discussion there.
>
> Thanks,
>
> --
> Peter Xu
>
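
For reference, the offset arithmetic discussed above can be sketched in
plain userspace C.  This is only an illustration under stated assumptions,
not kernel code: struct toy_vma, TOY_PAGE_SHIFT and the toy_*() helpers are
hypothetical names, a 4 KiB page size is assumed, and the last helper models
only the pgoff+pglen comparison, not the other mergeability checks.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the few vma fields discussed in this thread. */
struct toy_vma {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* one past the last byte of the mapping */
	unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

/* Page offset that matches 'start' when 'start' lies inside 'vma'. */
static unsigned long toy_range_pgoff(const struct toy_vma *vma,
				     unsigned long start)
{
	assert(start >= vma->vm_start && start < vma->vm_end);
	return vma->vm_pgoff + ((start - vma->vm_start) >> TOY_PAGE_SHIFT);
}

/* Clamp the requested end to the current vma: min(end, vma->vm_end). */
static unsigned long toy_range_end(const struct toy_vma *vma,
				   unsigned long end)
{
	return end < vma->vm_end ? end : vma->vm_end;
}

/*
 * Offset-only part of a "merge with next" test: the next vma must begin
 * at pgoff + pglen, where pglen is the length of [start, end) in pages.
 */
static int toy_next_pgoff_matches(const struct toy_vma *next,
				  unsigned long pgoff,
				  unsigned long start, unsigned long end)
{
	unsigned long pglen = (end - start) >> TOY_PAGE_SHIFT;

	return next->vm_pgoff == pgoff + pglen;
}
```

As a worked case: for a vma covering 0x10000-0x20000 with vm_pgoff 5, a
request starting at 0x14000 is 4 pages into the vma, so its pgoff is 9.  If
the request ends at 0x28000, the end is clamped to 0x20000, giving a length
of 12 pages, so an adjacent next vma passes the offset comparison only if
its vm_pgoff is 21.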