Re: mm: mmap_sem lock assertion failure in __mlock_vma_pages_range

Davidlohr Bueso <davidlohr@xxxxxx> · Tue, 11 Mar 2014 13:21:37 -0700

On Tue, 2014-03-11 at 16:12 -0400, Sasha Levin wrote:
> On 03/11/2014 04:07 PM, Davidlohr Bueso wrote:
> > On Tue, 2014-03-11 at 15:39 -0400, Sasha Levin wrote:
> >> Hi all,
> >>
> >> I've ended up deleting the log file by mistake, but this bug does seem to be important
> >> so I'd rather not wait until the same issue is triggered again.
> >>
> >> The call chain is:
> >>
> >> 	mlock (mm/mlock.c:745)
> >> 		__mm_populate (mm/mlock.c:700)
> >> 			__mlock_vma_pages_range (mm/mlock.c:229)
> >> 				VM_BUG_ON(!rwsem_is_locked(&mm->mmap_sem));
> >
> > So __mm_populate() is only called by mlock(2) and this VM_BUG_ON seems
> > wrong as we call it without the lock held:
> >
> > 	up_write(&current->mm->mmap_sem);
> > 	if (!error)
> > 		error = __mm_populate(start, len, 0);
> > 	return error;
> > }
> >
> >>
> >> It seems to be a rather simple trace triggered from userspace. The only recent patch
> >> in the area (that I've noticed) was "mm/mlock: prepare params outside critical region".
> >> I've reverted it and am trying to test without it.
> >
> > Odd, this patch should definitely *not* cause this. In any case every
> > operation removed from the critical region is local to the function:
> >
> > 	lock_limit = rlimit(RLIMIT_MEMLOCK);
> > 	lock_limit >>= PAGE_SHIFT;
> > 	locked = len >> PAGE_SHIFT;
> >
> > 	down_write(&current->mm->mmap_sem);
> 
> Yeah, this patch doesn't look like it's causing it, I guess it was more of a "you touched this
> code last - do you still remember what's going on here?" :).

How frequently do you trigger this issue? Could you verify whether it
still occurs with my patch reverted?

> It's semi-odd because it seems like an obvious issue to hit with trinity but it's the first time
> I've seen it and it's probably been there for a while (that BUG_ON has been there since 2009).

Actually that VM_BUG_ON is correct, because we do in fact take the
mmap_sem (for reading) inside __mm_populate(), which in turn calls
__mlock_vma_pages_range() with the lock held. Now, the lock is taken
within the for loop, which does the whole "if (!locked) down_read()"
dance, but that just makes sure we take the lock on the first
iteration. So besides doing the locking outside of the loop, which would
be just a cleanup, I don't really see how it could be triggered.
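
To make the pattern concrete, here is a minimal userspace sketch of the
locking described above. It is not the kernel code: every toy_* name,
the reader counter standing in for the rwsem, and the fixed chunk size
are assumptions made up purely for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for an rwsem: it only counts readers. */
struct toy_rwsem {
	int readers;
};

static void toy_down_read(struct toy_rwsem *sem)
{
	sem->readers++;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

static int toy_is_locked(const struct toy_rwsem *sem)
{
	return sem->readers > 0;
}

/* Plays the role of __mlock_vma_pages_range(): the caller must hold the lock. */
static void toy_populate_range(struct toy_rwsem *sem,
			       unsigned long start, unsigned long end)
{
	assert(toy_is_locked(sem));	/* analogue of the VM_BUG_ON */
	(void)start;
	(void)end;
}

/*
 * Plays the role of __mm_populate(): entered WITHOUT the lock, takes it
 * for reading on the first loop iteration, and drops it on the way out.
 * Returns the number of ranges processed, or -1 for a zero chunk size.
 */
static int toy_mm_populate(struct toy_rwsem *sem, unsigned long start,
			   unsigned long len, unsigned long chunk)
{
	unsigned long end = start + len;
	unsigned long nstart, nend;
	int locked = 0;
	int calls = 0;

	if (chunk == 0)
		return -1;

	for (nstart = start; nstart < end; nstart = nend) {
		if (!locked) {
			/* Only true on the first iteration in this sketch. */
			locked = 1;
			toy_down_read(sem);
		}
		nend = nstart + chunk;
		if (nend > end)
			nend = end;
		toy_populate_range(sem, nstart, nend);
		calls++;
	}
	if (locked)
		toy_up_read(sem);
	return calls;
}
```

In this model, calling toy_mm_populate() on an unlocked toy_rwsem never
trips the assertion in toy_populate_range(), and the reader count is
back to zero on return, which is why I would not expect the real check
to fire on this path.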

Thanks,
Davidlohr
