On Tue, Sep 19, 2023 at 1:53 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 18-09-23 14:16:08, Suren Baghdasaryan wrote:
> > When queue_pages_range() encounters an unmovable page, it terminates
> > its page walk. This walk, among other things, locks the VMAs in the range.
> > This termination might result in some VMAs being left unlocked after
> > queue_pages_range() completes. Since do_mbind() continues to operate on
> > these VMAs despite the failure from queue_pages_range(), it will encounter
> > an unlocked VMA.
> > This mbind() behavior has been modified several times before and might
> > need some changes to either finish the page walk even in the presence
> > of unmovable pages or to error out immediately after the failure of
> > queue_pages_range(). However, that requires more discussion, so to
> > fix the immediate issue, explicitly lock the VMAs in the range if
> > queue_pages_range() failed. The added condition does not save much
> > but documents when this extra locking is needed.
>
> The semantics of the walk in this case are really clear as mud. I was
> trying to reconstruct the whole picture and it really hurts... Then I
> found http://lkml.kernel.org/r/CAHbLzkrmTaqBRmHVdE2kyW57Uoghqd_E+jAXC9cB5ofkhL-uvw@xxxxxxxxxxxxxx
> and that helped a lot. Let's keep it as a reference, at least in the
> email thread here, for the future.

FYI, I'm working on a fix for the regression mentioned in that series,
and Hugh has some cleanups and enhancements for that too.

> > Fixes: 49b0638502da ("mm: enable page walking API to lock vmas during the walk")
> > Reported-by: syzbot+b591856e0f0139f83023@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Closes: https://lore.kernel.org/all/000000000000f392a60604a65085@xxxxxxxxxx/
> > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
>
> I cannot say I like the patch (it looks like potential double locking
> unless you realize this lock is special), but considering this might be
> just temporary, I do not mind.
>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>
> Thanks!
>
> > ---
> >  mm/mempolicy.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 42b5567e3773..cbc584e9b6ca 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1342,6 +1342,9 @@ static long do_mbind(unsigned long start, unsigned long len,
> >  	vma_iter_init(&vmi, mm, start);
> >  	prev = vma_prev(&vmi);
> >  	for_each_vma_range(vmi, vma, end) {
> > +		/* If queue_pages_range failed then not all VMAs might be locked */
> > +		if (ret)
> > +			vma_start_write(vma);
> >  		err = mbind_range(&vmi, vma, &prev, start, end, new);
> >  		if (err)
> >  			break;
> > --
> > 2.42.0.459.ge4e396fd5e-goog
>
> --
> Michal Hocko
> SUSE Labs
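
For reference, below is a condensed sketch of the do_mbind() flow with
this fix applied (simplified from mm/mempolicy.c; declarations,
mmap_lock handling, and most error paths are trimmed, so this is not
verbatim kernel source). The "special" property that makes the apparent
double locking safe is that vma_start_write() returns immediately if
the VMA was already write-locked under the current mmap_lock write
cycle:

	/* The walk write-locks each VMA it visits, but may stop early. */
	ret = queue_pages_range(mm, start, end, nmask,
				flags | MPOL_MF_INVERT, &pagelist);
	if (ret < 0) {
		err = ret;
		goto up_out;		/* hard error: bail out entirely */
	}

	/*
	 * ret > 0 means the walk hit an unmovable page and terminated
	 * early, so VMAs beyond that point were never write-locked.
	 */
	vma_iter_init(&vmi, mm, start);
	prev = vma_prev(&vmi);
	for_each_vma_range(vmi, vma, end) {
		/* If queue_pages_range failed then not all VMAs might be locked */
		if (ret)
			vma_start_write(vma);	/* no-op if already locked */
		err = mbind_range(&vmi, vma, &prev, start, end, new);
		if (err)
			break;
	}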