On Wed, Aug 19, 2015 at 12:21:45PM +0300, Kirill A. Shutemov wrote: > Hugh has pointed that compound_head() call can be unsafe in some > context. There's one example: > > CPU0 CPU1 > > isolate_migratepages_block() > page_count() > compound_head() > !!PageTail() == true > put_page() > tail->first_page = NULL > head = tail->first_page > alloc_pages(__GFP_COMP) > prep_compound_page() > tail->first_page = head > __SetPageTail(p); > !!PageTail() == true > <head == NULL dereferencing> > > The race is pure theoretical. I don't it's possible to trigger it in > practice. But who knows. > > We can fix the race by changing how encode PageTail() and compound_head() > within struct page to be able to update them in one shot. > > The patch introduces page->compound_head into third double word block in > front of compound_dtor and compound_order. That means it shares storage > space with: > > - page->lru.next; > - page->next; > - page->rcu_head.next; > - page->pmd_huge_pte; > > That's too long list to be absolutely sure, but looks like nobody uses > bit 0 of the word. It can be used to encode PageTail(). And if the bit > set, rest of the word is pointer to head page. > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> > Acked-by: Michal Hocko <mhocko@xxxxxxxx> > Cc: Hugh Dickins <hughd@xxxxxxxxxx> > Cc: David Rientjes <rientjes@xxxxxxxxxx> > Cc: Vlastimil Babka <vbabka@xxxxxxx> If DEFERRED_STRUCT_PAGE_INIT=n, combining this patchset with my page-flags patches causes oops in SetPageReserved() called from reserve_bootmem_region(). It happens because we haven't yet initilized the word in struct page and PageTail() inside SetPageReserved() can give false-positive, which leads to bogus compound_head() result. IIUC, we initialize the word only on first allocation of the page. It can be too late: pfn scanner can see false-positive PageTail() from not yet allocated pages too. Here's fixlet for patch to address the issue. diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 347724850665..d0e3fca830f8 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -892,6 +892,8 @@ static void init_reserved_page(unsigned long pfn) #else static inline void init_reserved_page(unsigned long pfn) { + /* Avoid false-positive PageTail() */ + INIT_LIST_HEAD(&pfn_to_page(pfn)->lru); } #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */ -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>