Hi Linus, On Fri, May 19, 2023 at 3:21 PM Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Fri, May 19, 2023 at 12:09 PM Joel Fernandes (Google) > <joel@xxxxxxxxxxxxxxxxx> wrote: > > > > +static bool check_addr_in_prev(struct vm_area_struct *vma, unsigned long addr, > > + unsigned long mask) > > +{ > > + int addr_masked = addr & mask; > > + struct vm_area_struct *prev = NULL, *cur = NULL; > > + > > + /* If the masked address is within vma, there is no prev mapping of concern. */ > > + if (vma->vm_start <= addr_masked) > > + return false; > > Hmm. > > I should have caught this last time, but I didn't. > > That test smells bad to me. Or maybe it's just the comment. > > I *suspect* that the test is literally just for the stack movement > case by execve, where it catches the case where we're doing the > movement entirely within the one vma we set up. Yes that's right, the test is only for the stack movement case. For the regular mremap case, I don't think there is a way for it to trigger. > But in the *general* case I think the above is horribly wrong: if you > want to move pages within an existing mapping, the page moving code > can't just randomly say "I'll expand the area you wanted to move". > Again, in that general mremap() case (as opposed to the special stack > moving case for execve), I *think* that the caller has already split > the vma's at the point of the move, and this test simply cannot ever > trigger. Yes, the test simply cannot ever trigger for mremap() but we still need the test for the stack case. That's why I had considered adding it and I had indeed reviewed the stack case when adding the test. I could update the comment to mention that, if you want. > So I think the _code_ works, but I think the comment in particular is > questionable, and I'm a bit surprised about the code too,. because I > thought execve() only expanded to exactly the moving area. It expands to cover both the new start and the old end of the stack AFAICS: /* * cover the whole range: [new_start, old_end) */ if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL)) return -ENOMEM; In this case, it will trigger for the stack because this same expanded vma is passed as both the new vma and the old vma to move_page_tables(). */ if (length != move_page_tables(vma, old_start, vma, new_start, length, false)) return -ENOMEM; So AFAICS, it is possible that old_start will start later than this newly expanded VMA. So for such a situation, old_start can be realigned to PMD and the test allows that by saying it need not worry about aligning down to an existing VMA. > End result: I think the patch on the whole looks nice, and smaller > than I expected. I also suspect it works in practice, but I'd like > that test clarified. Does it *actually* trigger for the stack moving > case? Because I think it must (never* trigger for the mremap case? You are right that the test will not trigger for the mremap case. But from a correctness standpoint, I thought of leaving it there for the stack (and who knows for what other future reasons the test may be needed). I can update the comment describing the stack if you like. > And maybe I'm the one confused here, and all I really need is an > explanation with small words and simple grammar starting with "No, > Linus, this is for case xyz" Hopefully it is clear now and you agree. Let me know if you want me to do something more. I can make some time next week to trace the stack case a bit more if you like and report back on any behaviors, however the mremap tests I did are looking good and working as expected. thanks, - Joel