On 27.11.19 15:13, Michal Hocko wrote:
> On Wed 27-11-19 21:13:00, Kefeng Wang wrote:
>>
>>
>> On 2019/11/27 19:47, Michal Hocko wrote:
>>> On Wed 27-11-19 18:28:00, Kefeng Wang wrote:
>>>> The start_pfn and end_pfn are already available in move_freepages_block(),
>>>> pfn_valid_within() should validate pfn first before touching the page,
>>>> or we might access an unitialized page with CONFIG_HOLES_IN_ZONE configs.
>>>>
>>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>>>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
>>>> ---
>>>>
>>>> Here is an oops in 4.4(arm64 enabled CONFIG_HOLES_IN_ZONE),
>>>
>>> Is this reproducible with the current upstream kernel? There were large
>>> changes in this aread since 4.4
>>
>> Our inner tester found this oops twice, but couldn't be reproduced for now,
>> even in 4.4 kernel, still trying...
>>
>> But the page_to_pfn() shouldn't be used in move_freepages(), right? ; )
>
> Well, I do agree that going back and forth between page and pfn is ugly.
> So this as a cleanup makes sense to me. But you are trying to fix a bug
> and that bug should be explained. NULL ptr dereference sounds like a
> memmap is not allocated for the particular pfn and this is a bit
> unexpected even with holes, at least on x86, maybe arm64 allows that.

AFAIK ARM allows that. (and arm64) It's basically CONFIG_HAVE_ARCH_PFN_VALID
(and CONFIG_HOLES_IN_ZONE if I am not wrong)

commit eb33575cf67d3f35fa2510210ef92631266e2465
Author: Mel Gorman <mel@xxxxxxxxx>
Date:   Wed May 13 17:34:48 2009 +0100

    [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2

    pfn_valid() is meant to be able to tell if a given PFN has valid memmap
    associated with it or not. In FLATMEM, it is expected that holes always
    have valid memmap as long as there is valid PFNs either side of the hole.
    In SPARSEMEM, it is assumed that a valid section has a memmap for the
    entire section.

    However, ARM and maybe other embedded architectures in the future free
    memmap backing holes to save memory on the assumption the memmap is
    never used. The page_zone linkages are then broken even though
    pfn_valid() returns true. A walker of the full memmap must then do this
    additional check to ensure the memmap they are looking at is sane by
    making sure the zone and PFN linkages are still valid. This is
    expensive, but walkers of the full memmap are extremely rare.
    [...]

And

commit 7b7bf499f79de3f6c85a340c8453a78789523f85
Author: Will Deacon <will.deacon@xxxxxxx>
Date:   Thu May 19 13:21:14 2011 +0100

    ARM: 6913/1: sparsemem: allow pfn_valid to be overridden when using SPARSEMEM

Side note: I find overriding pfn_valid() extremely ugly ...
	... and CONFIG_HOLES_IN_ZONE as well.

> But the changelog should be clear about all this rather than paper over
> a deeper problem potentially. Please also make sure to involve arm64
> people.
>

--

Thanks,

David / dhildenb