On 2019/10/8 17:12, Vlastimil Babka wrote: > On 10/8/19 10:35 AM, Kefeng Wang wrote: >> Hi Vlastimil and all, >> >> We met a Null pointer when do page_to_pfn() in __free_one_page() in older kernel, __nr_to_section(__sec) return NULL, >> >> #define __page_to_pfn(pg) \ >> ({ const struct page *__pg = (pg); \ >> int __sec = page_to_section(__pg); \ >> (unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \ >> }) >> >> Before v4.11, __free_one_page() use pfn_valid_within(page_to_pfn(buddy)) to check pfn, after the > > Hmm, looks like the code before v4.11 was wrong. pfn_valid_(within) > should be checked first, before obtaining and working with the struct > page. Here we already have a struct page obtained by pointer arithmetics, > and are calling page_to_pfn() on it, which means accessing its flags. > The pfn_valid_within() then comes too late. > >> following patches, it use buddy_pfn directly. >> >> b4fb8f66f1ae mm, page_alloc: Add missing check for memory holes >> 13ad59df67f1 mm, page_alloc: avoid page_to_pfn() when merging buddies // NOTE: directly use buddy_pfn >> 76741e776a37 mm, page_alloc: don't convert pfn to idx when merging > > Looks like my patches fixed that code without realizing there was > the bug. Commit b4fb8f66f1ae shows I've also introduced it elsewhere. > >> If use page_to_pfn(buddy) back in mainline 5.4-rc2, the same issue will occur, > > No surprise, we must validate pfn first before touching the page. > >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index c0b2e0306720..fbbfe8e8b4ca 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -934,7 +934,7 @@ static inline void __free_one_page(struct page *page, >> buddy_pfn = __find_buddy_pfn(pfn, order); >> buddy = page + (buddy_pfn - pfn); >> >> - if (!pfn_valid_within(buddy_pfn)) >> + if (!pfn_valid_within(page_to_pfn(buddy))) >> goto done_merging; >> if (!page_is_buddy(page, buddy, order)) >> goto done_merging; >> >> It shows the buddy->flags is wrong, that is, buddy is bad, we find the buddy by page + (buddy_pfn - pfn), >> so there is some issue in __find_buddy_pfn(pfn, order)? > > No, result of __find_buddy_pfn has to be validated first. Hi Vlastimil, Thank you for your explanation, got it. > >> The following is the debug print and CallTrace, any comment? >> >> Thanks, >> Kefeng >> >> 1) MEMBLOCK configuration: >> memory size = 0x0000000036800000 reserved size = 0x0000000004ca7fbc >> memory.cnt = 0x4 >> memory[0x0] [0x0000000000000000-0x0000000013ffffff], 0x0000000014000000 bytes flags: 0x0 >> memory[0x1] [0x000000002d600000-0x0000000033ffffff], 0x0000000006a00000 bytes flags: 0x0 >> memory[0x2] [0x0000000034800000-0x00000000445fffff], 0x000000000fe00000 bytes flags: 0x0 >> memory[0x3] [0x0000000044a00000-0x00000000509fffff], 0x000000000c000000 bytes flags: 0x0 >> reserved.cnt = 0x6 >> reserved[0x0] [0x000000002d680000-0x000000002e7c8fff], 0x0000000001149000 bytes flags: 0x0 >> reserved[0x1] [0x0000000030b00000-0x000000003239cfff], 0x000000000189d000 bytes flags: 0x0 >> reserved[0x2] [0x0000000032400000-0x00000000324fffff], 0x0000000000100000 bytes flags: 0x0 >> reserved[0x3] [0x000000004e000000-0x000000004fffffff], 0x0000000002000000 bytes flags: 0x0 >> reserved[0x4] [0x000000005083e040-0x0000000050849ffb], 0x000000000000bfbc bytes flags: 0x0 >> reserved[0x5] [0x000000005084a000-0x00000000509fffff], 0x00000000001b6000 bytes flags: 0x0 > > These might be holes in the zones, right. > >> 2) CONFIG >> CONFIG_SPARSEMEM_MANUAL=y >> CONFIG_SPARSEMEM=y >> CONFIG_HAVE_MEMORY_PRESENT=y >> CONFIG_SPARSEMEM_EXTREME=y >> CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y >> # CONFIG_SPARSEMEM_VMEMMAP is not set > > Is CONFIG_HOLES_IN_ZONE enabled? Probably yes as that's arm64. Yes, arm64 force enable HOLES_IN_ZONE. > >> 3) debug print and CallTrace >> __free_one_page , order = 9, max_order = 11, page = ffffff804e128000, buddy = ffffff804e120000, sec = 42623, mem_section = 0000000000000000 > > I would assume buddy is in one of the holes, but you'd have to print the pfn's to be sure. buddy_pfn = 280576, buddy_addr = 44800000, and it is in the hole between memory[2] and memory[3]. > >> buddy = ffffff804e120000, __sec = 42623, buddy->flags = 299fcc27aebc552f, SECTIONS_PGSHIFT = 46, SECTIONS_MASK = 3ffff >> __section_mem_map_addr, section = 0000000000000000