Re: mm: BUG in unmap_page_range

On Thu, Sep 04, 2014 at 05:04:37AM -0400, Sasha Levin wrote:
> On 08/29/2014 09:23 PM, Sasha Levin wrote:
> > On 08/27/2014 11:26 AM, Mel Gorman wrote:
> >> > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> >> > index 281870f..ffea570 100644
> >> > --- a/include/asm-generic/pgtable.h
> >> > +++ b/include/asm-generic/pgtable.h
> >> > @@ -723,6 +723,9 @@ static inline pte_t pte_mknuma(pte_t pte)
> >> >  
> >> >  	VM_BUG_ON(!(val & _PAGE_PRESENT));
> >> >  
> >> > +	/* debugging only, specific to x86 */
> >> > +	VM_BUG_ON(val & _PAGE_PROTNONE);
> >> > +
> >> >  	val &= ~_PAGE_PRESENT;
> >> >  	val |= _PAGE_NUMA;
> > Triggered again, the first VM_BUG_ON got hit, the second one never did.
> 
> Okay, this bug has reproduced quite a few times since then that I no longer
> suspect it's random memory corruption. I'd be happy to try out more debug
> patches if you have any leads.
> 

The fact that the second one doesn't trigger makes me think that this is not
related to how the helpers are called and is instead related to timing.
I tried reproducing this but got nothing after 3 hours. How long does it
typically take to reproduce in a given run? You mentioned that it takes a
few weeks to hit but maybe the frequency has changed since. I tried today's
linux-next kernel but it didn't even boot, so I used next-20140826 to match
your original report but got nothing. Can you also send me the config you
used in case that's a factor?
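
For reference, the ordering of the two checks in the quoted debugging patch
can be modelled in user space. This is only a sketch: the bit positions
below are illustrative assumptions, not copied from the x86 headers, and
check_mknuma() and mknuma() are hypothetical names that return a code
instead of calling VM_BUG_ON.

```c
#include <stdint.h>

/* Illustrative bit positions only (assumed, not taken from the kernel). */
#define PAGE_PRESENT  (UINT64_C(1) << 0)
#define PAGE_PROTNONE (UINT64_C(1) << 8)
#define PAGE_NUMA     (UINT64_C(1) << 9)

enum mknuma_result { MKNUMA_OK, MKNUMA_NOT_PRESENT, MKNUMA_PROTNONE };

/* Hypothetical stand-in for the two assertions, in the patch's order. */
static enum mknuma_result check_mknuma(uint64_t val)
{
	if (!(val & PAGE_PRESENT))
		return MKNUMA_NOT_PRESENT;	/* models the first VM_BUG_ON */
	if (val & PAGE_PROTNONE)
		return MKNUMA_PROTNONE;		/* models the added debug check */
	return MKNUMA_OK;
}

/* Models the bit update that follows the checks in the quoted hunk. */
static uint64_t mknuma(uint64_t val)
{
	val &= ~PAGE_PRESENT;
	val |= PAGE_NUMA;
	return val;
}
```

In this model an entry with the present bit clear is always reported by the
first check, whatever the state of the other bits, so the added check can
only fire for an entry that is present and also has the PROTNONE bit set.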

I had one hunch that this may somehow be related to a collision between
pagetable teardown during exit and the scanner but I could not find a
way that could actually happen. During teardown there should be only one
user of the mm and it can't race with itself.

A worse possibility is that somehow the lock is getting corrupted but
that's also a tough sell considering that the locks should be allocated
from a dedicated cache. I guess I could try breaking that to allocate
one page per lock so DEBUG_PAGEALLOC triggers but I'm not very
optimistic.

-- 
Mel Gorman
SUSE Labs
