On Mon, Jun 13, 2022 at 11:23 PM Zi Yan <ziy@xxxxxxxxxx> wrote: > > Hi Xianting, > > Thanks for your patch. > > On 13 Jun 2022, at 9:10, Xianting Tian wrote: > > > Commit 787af64d05cd ("mm: page_alloc: validate buddy before check its migratetype.") > > added buddy check code. But unfortunately, this fix isn't backported to > > linux-5.17.y and the former stable branches. The reason is it added wrong > > fixes message: > > Fixes: 1dd214b8f21c ("mm: page_alloc: avoid merging non-fallbackable > > pageblocks with others") > > No, the Fixes tag is right. The commit above does need to validate buddy. I think Xianting is right. The “Fixes:" tag is not accurate and the page_is_buddy() is necessary here. This patch could be applied to the early version of the stable tree (eg: Linux-5.10.y, not the master tree) > > > Actually, this issue is involved by commit: > > commit d9dddbf55667 ("mm/page_alloc: prevent merging between isolated and other pageblocks") > > > > For RISC-V arch, the first 2M is reserved for sbi, so the start PFN is 512, > > but it got buddy PFN 0 for PFN 0x2000: > > 0 = 0x2000 ^ (1 << 12) > > With the illegal buddy PFN 0, it got an illegal buddy page, which caused > > crash in __get_pfnblock_flags_mask(). > > It seems that the RISC-V arch reveals a similar bug from d9dddbf55667. > Basically, this bug will only happen when PFN=0x2000 is merging up and > there are some isolated pageblocks. Not PFN=0x2000, it's PFN=0x1000, I guess. RISC-V's first 2MB RAM could reserve for opensbi, so it would have riscv_pfn_base=512 and mem_map began with 512th PFN when CONFIG_FLATMEM=y. (Also, csky has the same issue: a non-zero pfn_base in some scenarios.) But __find_buddy_pfn algorithm thinks the start address is 0, it could get 0 pfn or less than the pfn_base value. We need another check to prevent that. > > BTW, what does first reserved 2MB imply? All 4KB pages from first 2MB are > set to PageReserved? > > > > > With the patch, it can avoid the calling of get_pageblock_migratetype() if > > it isn't buddy page. > > You might miss the __find_buddy_pfn() caller in unset_migratetype_isolate() > from mm/page_isolation.c, if you are talking about linux-5.17.y and former > version. There, page_is_buddy() is also not called and is_migrate_isolate_page() > is called, which calls get_pageblock_migratetype() too. > > > > > Fixes: d9dddbf55667 ("mm/page_alloc: prevent merging between isolated and other pageblocks") > > Cc: stable@xxxxxxxxxxxxxxx > > Reported-by: zjb194813@xxxxxxxxxxxxxxx > > Reported-by: tianhu.hh@xxxxxxxxxxxxxxx > > Signed-off-by: Xianting Tian <xianting.tian@xxxxxxxxxxxxxxxxx> > > --- > > mm/page_alloc.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index b1caa1c6c887..5b423caa68fd 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -1129,6 +1129,9 @@ static inline void __free_one_page(struct page *page, > > > > buddy_pfn = __find_buddy_pfn(pfn, order); > > buddy = page + (buddy_pfn - pfn); > > + > > + if (!page_is_buddy(page, buddy, order)) > > + goto done_merging; > > buddy_mt = get_pageblock_migratetype(buddy); > > > > if (migratetype != buddy_mt > > -- > > 2.17.1 > > -- > Best Regards, > Yan, Zi -- Best Regards Guo Ren ML: https://lore.kernel.org/linux-csky/