On Tue, Apr 12, 2022 at 01:16:23PM -0700, Sudarshan Rajagopalan wrote:
> Check if pfn is valid before or not before moving it to freelist.
>
> There are possible scenario where a pageblock can have partial physical
> hole and partial part of System RAM. This happens when base address in RAM
> partition table is not aligned to pageblock size.
>
> Example:
>
> Say we have this first two entries in RAM partition table -
>
> Base Addr: 0x0000000080000000 Length: 0x0000000058000000
> Base Addr: 0x00000000E3930000 Length: 0x0000000020000000

I wonder what was done to memory DIMMs to get such an interesting
physical memory layout...

> ...
>
> Physical hole: 0xD8000000 - 0xE3930000
>
> On system having 4K as page size and hence pageblock size being 4MB, the
> base address 0xE3930000 is not aligned to 4MB pageblock size.
>
> Now we will have pageblock which has partial physical hole and partial part
> of System RAM -
>
> Pageblock [0xE3800000 - 0xE3C00000] -
> 0xE3800000 - 0xE3930000 -- physical hole
> 0xE3930000 - 0xE3C00000 -- System RAM
>
> Now doing __alloc_pages say we get a valid page with PFN 0xE3B00 from
> __rmqueue_fallback, we try to put other pages from the same pageblock as well
> into freelist by calling steal_suitable_fallback().
>
> We then search for freepages from start of the pageblock due to below code -
>
> move_freepages_block(zone, page, migratetype, ...)
> {
> 	pfn = page_to_pfn(page);
> 	start_pfn = pfn & ~(pageblock_nr_pages - 1);
> 	end_pfn = start_pfn + pageblock_nr_pages - 1;
> ...
> }
>
> With the pageblock which has partial physical hole at the beginning, we will
> run into PFNs from the physical hole whose struct page is not initialized and
> is invalid, and system would crash as we operate on invalid struct page to find
> out of page is in Buddy or LRU or not

struct page must be initialized and valid even for holes in the physical
memory.
When a pageblock spans both existing memory and a hole, the struct pages
for the "hole" part should be marked as PG_reserved.

If you see that struct pages for memory holes exist but are invalid, we
should solve the underlying issue that causes the wrong struct page
contents.

> [ 107.629453][ T9688] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [ 107.639214][ T9688] Mem abort info:
> [ 107.642829][ T9688] ESR = 0x96000006
> [ 107.646696][ T9688] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 107.652878][ T9688] SET = 0, FnV = 0
> [ 107.656751][ T9688] EA = 0, S1PTW = 0
> [ 107.660705][ T9688] FSC = 0x06: level 2 translation fault
> [ 107.666455][ T9688] Data abort info:
> [ 107.670151][ T9688] ISV = 0, ISS = 0x00000006
> [ 107.674827][ T9688] CM = 0, WnR = 0
> [ 107.678615][ T9688] user pgtable: 4k pages, 39-bit VAs, pgdp=000000098a237000
> [ 107.685970][ T9688] [0000000000000000] pgd=0800000987170003, p4d=0800000987170003, pud=0800000987170003, pmd=0000000000000000
> [ 107.697582][ T9688] Internal error: Oops: 96000006 [#1] PREEMPT SMP
>
> [ 108.209839][ T9688] pc : move_freepages_block+0x174/0x27c

Can you post the faddr2line output for this address?

> [ 108.215407][ T9688] lr : steal_suitable_fallback+0x20c/0x398
>
> [ 108.305908][ T9688] Call trace:
> [ 108.309151][ T9688] move_freepages_block+0x174/0x27c [PageLRU]
> [ 108.314359][ T9688] steal_suitable_fallback+0x20c/0x398
> [ 108.319826][ T9688] rmqueue_bulk+0x250/0x934
> [ 108.324325][ T9688] rmqueue_pcplist+0x178/0x2ac
> [ 108.329086][ T9688] rmqueue+0x5c/0xc10
> [ 108.333048][ T9688] get_page_from_freelist+0x19c/0x430
> [ 108.338430][ T9688] __alloc_pages+0x134/0x424
> [ 108.343017][ T9688] page_cache_ra_unbounded+0x120/0x324
> [ 108.348494][ T9688] do_sync_mmap_readahead+0x1b0/0x234
> [ 108.353878][ T9688] filemap_fault+0xe0/0x4c8
> [ 108.358375][ T9688] do_fault+0x168/0x6cc
> [ 108.362518][ T9688] handle_mm_fault+0x5c4/0x848
> [ 108.367280][ T9688] do_page_fault+0x3fc/0x5d0
> [ 108.371867][ T9688] do_translation_fault+0x6c/0x1b0
> [ 108.376985][ T9688] do_mem_abort+0x68/0x10c
> [ 108.381389][ T9688] el0_ia+0x50/0xbc
> [ 108.385175][ T9688] el0t_32_sync_handler+0x88/0xbc
> [ 108.390208][ T9688] el0t_32_sync+0x1b8/0x1bc
>
> Hence, avoid operating on invalid pages within the same pageblock by checking
> if pfn is valid or not.
>
> Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@xxxxxxxxxxx>
> Fixes: 4c7b9896621be ("mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE")
> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>

For now the patch looks like a band-aid for a more fundamental bug, so

NAKED-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>

> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx>
> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> ---
>  mm/page_alloc.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6e5b448..e87aa053 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2521,6 +2521,11 @@ static int move_freepages(struct zone *zone,
>  	int pages_moved = 0;
>
>  	for (pfn = start_pfn; pfn <= end_pfn;) {
> +		if (!pfn_valid(pfn)) {
> +			pfn++;
> +			continue;
> +		}
> +
>  		page = pfn_to_page(pfn);
>  		if (!PageBuddy(page)) {
>  			/*
> --
> 2.7.4
>

-- 
Sincerely yours,
Mike.