Kirill A. Shutemov wrote: > Andrea Arcangeli wrote: > > Hi, > > > > On Mon, Dec 16, 2013 at 10:18:39AM -0500, Sasha Levin wrote: > > > On 12/16/2013 07:47 AM, Kirill A. Shutemov wrote: > > > > I probably miss some context here. Do you have crash on some use-case or > > > > what? Could you point me to start of discussion. > > > > > > Yes, Sorry, here's the crash that started this discussion originally: > > > > > > The code points to: > > > > > > > At this point pmd_none_or_trans_huge_or_clear_bad guaranteed us the > > pmd points to a regular pte. > > It took too long, but I finally found a way to reproduce the bug easily: > > #define _GNU_SOURCE > #include <sys/mman.h> > > #define MB (1024 * 1024) > > int main(int argc, char **argv) > { > void *p; > > p = mmap(0, 10 * MB, PROT_READ, > MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, > -1, 0); > mprotect(p, 10 * MB, PROT_NONE); > madvise(p, 10 * MB, MADV_WILLNEED); > return 0; > } > > And I track it down to pmd_none_or_trans_huge_or_clear_bad(). > > It seems it doesn't guarantee to return 1 for pmd_trans_huge() page and I > don't know how it suppose to do this for non-bad page. > > I've fixed this with patch below. > > Andrea, do I miss something important here or > pmd_none_or_trans_huge_or_clear_bad() is broken from day 1? Oh.. It seems it cased by change pmd_bad() behaviour by commit be3a728427a6, so pmd_none_or_trans_huge_or_clear_bad() misses THP pmds if they are pmd_numa(). Other way to get it work is below. I'm not sure which is more correct (if any). Mel? Andrea? > > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h > index f330d28e4d0e..0694c9bf2a34 100644 > --- a/include/asm-generic/pgtable.h > +++ b/include/asm-generic/pgtable.h > @@ -599,7 +599,7 @@ static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd) > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > barrier(); > #endif > - if (pmd_none(pmdval)) > + if (pmd_none(pmdval) || pmd_trans_huge(pmdval)) > return 1; > if (unlikely(pmd_bad(pmdval))) { > if (!pmd_trans_huge(pmdval)) diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index f330d28e4d0e..1f8bc7881bdb 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -558,6 +558,14 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp) } #endif +#ifndef pmd_numa +static inline int pmd_numa(pmd_t pmd) +{ + return (pmd_flags(pmd) & + (_PAGE_NUMA|_PAGE_PRESENT)) == _PAGE_NUMA; +} +#endif + /* * This function is meant to be used by sites walking pagetables with * the mmap_sem hold in read mode to protect against MADV_DONTNEED and @@ -601,7 +609,7 @@ static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd) #endif if (pmd_none(pmdval)) return 1; - if (unlikely(pmd_bad(pmdval))) { + if (unlikely(pmd_bad(pmdval) || pmd_numa(pmdval))) { if (!pmd_trans_huge(pmdval)) pmd_clear_bad(pmd); return 1; @@ -650,14 +658,6 @@ static inline int pte_numa(pte_t pte) } #endif -#ifndef pmd_numa -static inline int pmd_numa(pmd_t pmd) -{ - return (pmd_flags(pmd) & - (_PAGE_NUMA|_PAGE_PRESENT)) == _PAGE_NUMA; -} -#endif - /* * pte/pmd_mknuma sets the _PAGE_ACCESSED bitflag automatically * because they're called by the NUMA hinting minor page fault. If we -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>