On Tue, 25 Aug 2020 at 10:32, Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:
>
> On Tue, Aug 25, 2020 at 01:03:53PM +0530, Naresh Kamboju wrote:
> > On Mon, 24 Aug 2020 at 16:36, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Aug 24, 2020 at 03:14:55PM +0530, Naresh Kamboju wrote:
> > > > [   67.545247] BUG: Bad page state in process true  pfn:a8fed
> > > > [   67.550767] page:9640c0ab refcount:0 mapcount:-1024
> > >
> > > Somebody freed a page table without calling __ClearPageTable() on it.
> >
> > After running git bisect on this problem, the first suspect on the
> > arm architecture is this patch:
> > 424efe723f7717430bec7c93b4d28bba73e31cf6
> > ("mm: account PMD tables like PTE tables")
> >
> > Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
> > Reported-by: Anders Roxell <anders.roxell@xxxxxxxxxx>
>
> Can you please check if this fix helps?

That fixed the problem.

Cheers,
Anders

> diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
> index 9415222b49ad..b8cbe03ad260 100644
> --- a/arch/arm/include/asm/tlb.h
> +++ b/arch/arm/include/asm/tlb.h
> @@ -59,6 +59,7 @@ __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
>  #ifdef CONFIG_ARM_LPAE
>  	struct page *page = virt_to_page(pmdp);
>
> +	pgtable_pmd_page_dtor(page);
>  	tlb_remove_table(tlb, page);
>  #endif
>  }
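For anyone following along: after the suspected patch, pgtable_pmd_page_ctor()
marks the PMD page as a page-table page and bumps NR_PAGETABLE, so every free
path has to run the matching pgtable_pmd_page_dtor() before handing the page
back to the allocator, and the arm LPAE __pmd_free_tlb() path above was the
one missing that call. Here is a minimal userspace model of that contract;
the function names mirror the kernel's, but the bodies are simplified
stand-ins, not the real implementations:

#include <stdbool.h>
#include <stdio.h>

struct page { bool is_table; };

static long nr_pagetable;			/* stands in for NR_PAGETABLE */

static bool pgtable_pmd_page_ctor(struct page *page)
{
	page->is_table = true;			/* models __SetPageTable() */
	nr_pagetable++;				/* models inc_zone_page_state() */
	return true;
}

static void pgtable_pmd_page_dtor(struct page *page)
{
	page->is_table = false;			/* models __ClearPageTable() */
	nr_pagetable--;				/* models dec_zone_page_state() */
}

static void free_page_check(const struct page *page)
{
	if (page->is_table)			/* models the allocator's free-time check */
		puts("BUG: Bad page state (page still typed as a page table)");
}

int main(void)
{
	struct page pmd = { false };

	pgtable_pmd_page_ctor(&pmd);
	free_page_check(&pmd);			/* old arm path: dtor skipped -> splat */

	pgtable_pmd_page_dtor(&pmd);		/* fixed path: dtor runs before the free */
	free_page_check(&pmd);			/* quiet, and the counter is balanced */
	printf("nr_pagetable = %ld\n", nr_pagetable);
	return 0;
}

Built and run, the first free_page_check() prints the warning and the second
stays quiet, which is the before/after of the one-line fix above.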
> > Additional information:
> > We have tested linux-next by reverting this patch and confirmed
> > that the reported BUG is not reproduced.
> >
> > These configs are enabled on the running device:
> >
> > CONFIG_TRANSPARENT_HUGEPAGE=y
> > CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
> >
> >
> > -- Suspected patch --
> > commit 424efe723f7717430bec7c93b4d28bba73e31cf6
> > Author: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> > Date:   Thu Aug 20 10:01:30 2020 +1000
> >
> >     mm: account PMD tables like PTE tables
> >
> >     We account the PTE level of the page tables to the process in order to
> >     make smarter OOM decisions and help diagnose why memory is fragmented.
> >     For these same reasons, we should account pages allocated for PMDs.  With
> >     larger process address spaces and ASLR, the number of PMDs in use is
> >     higher than it used to be so the inaccuracy is starting to matter.
> >
> >     Link: http://lkml.kernel.org/r/20200627184642.GF25039@xxxxxxxxxxxxxxxxxxxx
> >     Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> >     Reviewed-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> >     Cc: Abdul Haleem <abdhalee@xxxxxxxxxxxxxxxxxx>
> >     Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> >     Cc: Arnd Bergmann <arnd@xxxxxxxx>
> >     Cc: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
> >     Cc: Joerg Roedel <joro@xxxxxxxxxx>
> >     Cc: Max Filippov <jcmvbkbc@xxxxxxxxx>
> >     Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >     Cc: Satheesh Rajendran <sathnaga@xxxxxxxxxxxxxxxxxx>
> >     Cc: Stafford Horne <shorne@xxxxxxxxx>
> >     Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> >     Signed-off-by: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index b0a15ee77b8a..a4e5b806347c 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2239,7 +2239,7 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
> >  	return ptlock_ptr(pmd_to_page(pmd));
> >  }
> >
> > -static inline bool pgtable_pmd_page_ctor(struct page *page)
> > +static inline bool pmd_ptlock_init(struct page *page)
> >  {
> >  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >  	page->pmd_huge_pte = NULL;
> > @@ -2247,7 +2247,7 @@ static inline bool pgtable_pmd_page_ctor(struct page *page)
> >  	return ptlock_init(page);
> >  }
> >
> > -static inline void pgtable_pmd_page_dtor(struct page *page)
> > +static inline void pmd_ptlock_free(struct page *page)
> >  {
> >  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >  	VM_BUG_ON_PAGE(page->pmd_huge_pte, page);
> > @@ -2264,8 +2264,8 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
> >  	return &mm->page_table_lock;
> >  }
> >
> > -static inline bool pgtable_pmd_page_ctor(struct page *page) { return true; }
> > -static inline void pgtable_pmd_page_dtor(struct page *page) {}
> > +static inline bool pmd_ptlock_init(struct page *page) { return true; }
> > +static inline void pmd_ptlock_free(struct page *page) {}
> >
> >  #define pmd_huge_pte(mm, pmd) ((mm)->pmd_huge_pte)
> >
> > @@ -2278,6 +2278,22 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
> >  	return ptl;
> >  }
> >
> > +static inline bool pgtable_pmd_page_ctor(struct page *page)
> > +{
> > +	if (!pmd_ptlock_init(page))
> > +		return false;
> > +	__SetPageTable(page);
> > +	inc_zone_page_state(page, NR_PAGETABLE);
> > +	return true;
> > +}
> > +
> > +static inline void pgtable_pmd_page_dtor(struct page *page)
> > +{
> > +	pmd_ptlock_free(page);
> > +	__ClearPageTable(page);
> > +	dec_zone_page_state(page, NR_PAGETABLE);
> > +}
> > +
> >  /*
> >   * No scalability reason to split PUD locks yet, but follow the same pattern
> >   * as the PMD locks to make it easier if we decide to.  The VM should not be
> >
> >
> > - Naresh
>
> --
> Sincerely yours,
> Mike.
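A closing note on decoding the splat itself: mapcount:-1024 is the page-type
encoding showing through. Going by my reading of the v5.8-era
include/linux/page-flags.h (worth double-checking against your tree), page
types live in the same word as _mapcount, PG_table is 0x400, __SetPageTable()
clears that bit in an all-ones word, and dump_page() prints _mapcount + 1.
A standalone check of that arithmetic:

#include <stdint.h>
#include <stdio.h>

#define PG_table 0x00000400u	/* assumed page_type bit for page-table pages */

int main(void)
{
	uint32_t page_type = 0xffffffffu;	/* _mapcount of a free page is -1 */

	page_type &= ~PG_table;			/* what __SetPageTable() does */

	/* dump_page() reports page_mapcount(), i.e. _mapcount + 1 */
	printf("mapcount:%d\n", (int32_t)page_type + 1);
	return 0;
}

This prints "mapcount:-1024", matching the report at the top of the thread:
the freed PMD page still carried the PG_table type because the dtor, and with
it __ClearPageTable(), never ran.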