On Fri 09-08-24 11:30:32, Michal Hocko wrote:
> On Thu 08-08-24 20:00:58, Hailong Liu wrote:
> > The __vmap_pages_range_noflush() assumes its argument pages** contains
> > pages with the same page shift. However, since commit e9c3cda4d86e
> > (mm, vmalloc: fix high order __GFP_NOFAIL allocations), if gfp_flags
> > includes __GFP_NOFAIL with high order in vm_area_alloc_pages()
> > and page allocation failed for high order, the pages** may contain
> > two different page shifts (high order and order-0). This could
> > lead __vmap_pages_range_noflush() to perform incorrect mappings,
> > potentially resulting in memory corruption.
> >
> > Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE):
> > kvmalloc(2M, __GFP_NOFAIL|GFP_X)
> >     __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP)
> >         vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0
> >             vmap_pages_range()
> >                 vmap_pages_range_noflush()
> >                     __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
> >
> > We can remove the fallback code because if a high-order
> > allocation fails, __vmalloc_node_range_noprof() will retry with
> > order-0. Therefore, it is unnecessary to fallback to order-0
> > here. Therefore, fix this by removing the fallback code.
> >
> > Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")
> > Signed-off-by: Hailong Liu <hailong.liu@xxxxxxxx>
> > Reported-by: Tangquan.Zheng <zhengtangquan@xxxxxxxx>
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > CC: Barry Song <21cnbao@xxxxxxxxx>
> > CC: Baoquan He <bhe@xxxxxxxxxx>
> > CC: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> > ---
> >  mm/vmalloc.c     | 11 ++---------
> >  mm/vmalloc.c.rej | 10 ++++++++++
>
> What is this?
>
> >  2 files changed, 12 insertions(+), 9 deletions(-)
> >  create mode 100644 mm/vmalloc.c.rej
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 6b783baf12a1..af2de36549d6 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3584,15 +3584,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> >  			page = alloc_pages_noprof(alloc_gfp, order);
> >  		else
> >  			page = alloc_pages_node_noprof(nid, alloc_gfp, order);
> > -		if (unlikely(!page)) {
> > -			if (!nofail)
> > -				break;
> > -
> > -			/* fall back to the zero order allocations */
> > -			alloc_gfp |= __GFP_NOFAIL;
> > -			order = 0;
> > -			continue;
> > -		}
> > +		if (unlikely(!page))
> > +			break;
>
> This just makes the NOFAIL allocation fail. So this is not a correct
> fix.

OK, I can see a newer version.

-- 
Michal Hocko
SUSE Labs
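
For anyone following the thread, the mixed page-shift problem from the
quoted changelog can be modelled in userspace. This is only a rough sketch
under stated assumptions, not kernel code: it assumes the mapper walks
pages[] in strides of 1 << (page_shift - PAGE_SHIFT) entries and treats the
first entry of each stride as the start of a physically contiguous chunk.
The helper chunks_are_contiguous(), the pfns[] array of page frame numbers
and the tiny 4-page chunk size used to exercise it are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumed 4 KiB base pages for this model */

/*
 * Hypothetical userspace model, not a kernel API.
 *
 * pfns[] holds one page frame number per order-0 slot, in the order the
 * allocator filled the array. A mapper that uses page_shift > PAGE_SHIFT
 * only looks at the first entry of each stride and assumes the rest of
 * the stride follows it physically. Return true only if that assumption
 * holds for every stride.
 */
static bool chunks_are_contiguous(const unsigned long *pfns, size_t nr,
				  unsigned int page_shift)
{
	assert(page_shift >= PAGE_SHIFT);

	size_t step = (size_t)1 << (page_shift - PAGE_SHIFT);

	for (size_t i = 0; i < nr; i += step) {
		for (size_t j = 1; j < step && i + j < nr; j++) {
			/* entry j must be the j-th frame after the chunk head */
			if (pfns[i + j] != pfns[i] + j)
				return false;
		}
	}
	return true;
}
```

With page_shift = PAGE_SHIFT + 2, an array built from two order-2
allocations passes the check. An array whose second half was filled by
order-0 fallback pages fails it, although the same array is fine when
mapped with page_shift = PAGE_SHIFT.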