> On Nov 27, 2019, at 9:13 AM, Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Wed 27-11-19 21:13:00, Kefeng Wang wrote:
>>
>>
>> On 2019/11/27 19:47, Michal Hocko wrote:
>>> On Wed 27-11-19 18:28:00, Kefeng Wang wrote:
>>>> The start_pfn and end_pfn are already available in move_freepages_block(),
>>>> pfn_valid_within() should validate pfn first before touching the page,
>>>> or we might access an uninitialized page with CONFIG_HOLES_IN_ZONE configs.
>>>>
>>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>>>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
>>>> ---
>>>>
>>>> Here is an oops in 4.4(arm64 enabled CONFIG_HOLES_IN_ZONE),
>>>
>>> Is this reproducible with the current upstream kernel? There were large
>>> changes in this area since 4.4
>>
>> Our inner tester found this oops twice, but couldn't be reproduced for now,
>> even in 4.4 kernel, still trying...
>>
>> But the page_to_pfn() shouldn't be used in move_freepages(), right? ; )
>
> Well, I do agree that going back and forth between page and pfn is ugly.
> So this as a cleanup makes sense to me. But you are trying to fix a bug
> and that bug should be explained. NULL ptr dereference sounds like a
> memmap is not allocated for the particular pfn and this is a bit
> unexpected even with holes, at least on x86, maybe arm64 allows that.
> But the changelog should be clear about all this rather than paper over
> a deeper problem potentially. Please also make sure to involve arm64
> people.

Indeed. Too often people can reproduce an issue only on old kernels, yet
insist on forward-fixing mainline as well, which only brings instability
there.
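
For reference, the ordering the quoted patch description asks for (validate
the pfn first, only then touch the page) can be sketched in user space
roughly as below. This is only an illustration under assumptions: every
toy_* name, the NULL-slot model of a hole, and the array size are made up
here. It is not the kernel's code and not the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel definitions. */
#define TOY_NR_PFNS 8

struct toy_page {
    int migratetype;
};

/* One slot per pfn; a NULL slot models a hole with no memmap behind it. */
struct toy_page *toy_memmap[TOY_NR_PFNS];

/* Assumed validity check: in range and backed by a struct toy_page. */
bool toy_pfn_valid_within(unsigned long pfn)
{
    return pfn < TOY_NR_PFNS && toy_memmap[pfn] != NULL;
}

struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
    return toy_memmap[pfn];
}

/*
 * Walk [start_pfn, end_pfn] by pfn and check validity before converting
 * to a page, so a hole is skipped without dereferencing anything.
 * Returns the number of pages whose migratetype was updated.
 */
int toy_move_range(unsigned long start_pfn, unsigned long end_pfn,
                   int migratetype)
{
    int moved = 0;

    assert(start_pfn <= end_pfn);
    for (unsigned long pfn = start_pfn; pfn <= end_pfn; pfn++) {
        if (!toy_pfn_valid_within(pfn))
            continue; /* hole: never touch a page here */
        toy_pfn_to_page(pfn)->migratetype = migratetype;
        moved++;
    }
    return moved;
}
```

The point of iterating by pfn rather than by page pointer is that the
validity check needs nothing from the page itself. Whether this ordering
explains the reported oops is exactly what the changelog would still need to
establish.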