On 1/21/19 1:19 PM, Michal Hocko wrote:
> On Mon 21-01-19 11:38:49, Qian Cai wrote:
>>
>>
>> On 1/21/19 4:53 AM, Michal Hocko wrote:
>>> On Thu 17-01-19 21:16:50, Qian Cai wrote:
> [...]
>>>> Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to
>>>> have holes")
>>>
>>> Did you mean
>>> Fixes: 9f1eb38e0e11 ("mm, kmemleak: little optimization while scanning")
>>
>> No, pfn_to_online_page() missed a few checks compared to pfn_valid(), at least on
>> arm64, where the returned pfn is no longer valid (pfn_valid() will skip those).
>>
>> 2d070eab2e82 introduced pfn_to_online_page(), so the fix was targeted at it.
>
> But it is 9f1eb38e0e11 which has replaced pfn_valid by
> pfn_to_online_page.

Well, the comment of pfn_to_online_page() says,

/*
 * Return page for the valid pfn only if the page is online.
 * All pfn walkers which rely on the fully initialized
 * page->flags and others should use this rather than
 * pfn_valid && pfn_to_page
 */

That seems incorrect to me in the first place, as it currently does not return
pages with "fully initialized page->flags" on arm64. Once this is fixed, there
is no problem with 9f1eb38e0e11. It seems to me that 9f1eb38e0e11 just depends
on a broken interface, so it is better to fix the broken interface.

>
>>
>>>
>>>> Signed-off-by: Qian Cai <cai@xxxxxx>
>>>> ---
>>>>  include/linux/memory_hotplug.h | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>>>> index 07da5c6c5ba0..b8b36e6ac43b 100644
>>>> --- a/include/linux/memory_hotplug.h
>>>> +++ b/include/linux/memory_hotplug.h
>>>> @@ -26,7 +26,7 @@ struct vmem_altmap;
>>>>  	struct page *___page = NULL;				\
>>>>  	unsigned long ___nr = pfn_to_section_nr(pfn);		\
>>>>  								\
>>>> -	if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr))\
>>>> +	if (online_section_nr(___nr) && pfn_valid(pfn))		\
>>>>  		___page = pfn_to_page(pfn);			\
>>>
>>> Why have you removed the bound check? Is this safe?
>>> Regarding the fix, I am not really sure TBH. If the section is online
>>> then we assume all struct pages to be initialized. If anything this
>>> should be limited to weird arches which might have holes, so
>>> pfn_valid_within().
>>
>> It looks to me that at least on arm64 and x86_64, pfn_valid() has done
>> this check already:
>>
>> if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>> 	return 0;
>
> But an everflow could happen before pfn_valid is evaluated, no?

I guess you mean "overflow". I'll probably keep that check and use
pfn_valid_within() anyway, so the check can be optimized away when
CONFIG_HOLES_IN_ZONE=n.