On 6/6/23 09:40, Lorenzo Stoakes wrote:
> On Tue, Jun 06, 2023 at 09:13:24AM +0200, Vlastimil Babka wrote:
>>
>> On 6/5/23 22:11, Lorenzo Stoakes wrote:
>>> In __vmalloc_area_node() we always warn_alloc() when an allocation
>>> performed by vm_area_alloc_pages() fails unless it was due to a pending
>>> fatal signal.
>>>
>>> However, huge page allocations instigated either by vmalloc_huge() or
>>> __vmalloc_node_range() (or a caller that invokes this like kvmalloc() or
>>> kvmalloc_node()) always falls back to order-0 allocations if the huge page
>>> allocation fails.
>>>
>>> This renders the warning useless and noisy, especially as all callers
>>> appear to be aware that this may fallback. This has already resulted in at
>>> least one bug report from a user who was confused by this (see link).
>>>
>>> Therefore, simply update the code to only output this warning for order-0
>>> pages when no fatal signal is pending.
>>>
>>> Link: https://bugzilla.suse.com/show_bug.cgi?id=1211410
>>> Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
>>
>> I think there are more reports of the same thing from the btrfs context, that
>> appear to be a 6.3 regression
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=217466
>> Link: https://lore.kernel.org/all/efa04d56-cd7f-6620-bca7-1df89f49bf4b@xxxxxxxxx/
>>
>> If this indeed helps, it would make sense to Cc: stable here. Although I
>> don't see what caused the regression, the warning itself is not new, so is
>> it a new source of order-9 attempts in vmalloc() or new reasons why order-9
>> pages would not be possible to allocate?
>
> Linus updated kvmalloc() to use huge vmalloc() allocations in 9becb6889130
> ("kvmalloc: use vmalloc_huge for vmalloc allocations") and Song updated
> alloc_large_system_hash() to do so as well in f2edd118d02d ("page_alloc: use
> vmalloc_huge for large system hash") both of which are ~1y old, however
> these would impact ~5.18, so it's weird to see reports citing 6.2 -> 6.3.
>
> Will dig to see if something else changed that would increase the
> prevalence of this.

I think I found the commit from 6.3 that effectively exposed this warning.
As this is a tracked regression I would really suggest moving the fix to
mm-hotfixes instead of mm-unstable, and adding:

Fixes: 80b1d8fdfad1 ("mm: vmalloc: correct use of __GFP_NOWARN mask in __vmalloc_area_node()")
Cc: <stable@xxxxxxxxxxxxxxx>

> Also while we're here, ugh at us immediately splitting the non-compound
> (also ugh) huge page. Nicholas explains why in the patch that introduces it
> - 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split
> rather than compound") - but it'd be nice if we could find a way to avoid
> this.
>
> If only there were a data type (perhaps beginning with 'f') that abstracted
> the order of the page entirely and could be guaranteed to always be the one
> with which you manipulated ref count, etc... ;)
>
>>
>>> ---
>>>  mm/vmalloc.c | 17 +++++++++++++----
>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index ab606a80f475..e563f40ad379 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3149,11 +3149,20 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>>>  	 * allocation request, free them via vfree() if any.
>>>  	 */
>>>  	if (area->nr_pages != nr_small_pages) {
>>> -		/* vm_area_alloc_pages() can also fail due to a fatal signal */
>>> -		if (!fatal_signal_pending(current))
>>> +		/*
>>> +		 * vm_area_alloc_pages() can fail due to insufficient memory but
>>> +		 * also:-
>>> +		 *
>>> +		 * - a pending fatal signal
>>> +		 * - insufficient huge page-order pages
>>> +		 *
>>> +		 * Since we always retry allocations at order-0 in the huge page
>>> +		 * case a warning for either is spurious.
>>> +		 */
>>> +		if (!fatal_signal_pending(current) && page_order == 0)
>>>  			warn_alloc(gfp_mask, NULL,
>>> -				"vmalloc error: size %lu, page order %u, failed to allocate pages",
>>> -				area->nr_pages * PAGE_SIZE, page_order);
>>> +				"vmalloc error: size %lu, failed to allocate pages",
>>> +				area->nr_pages * PAGE_SIZE);
>>>  		goto fail;
>>>  	}
>>>
>>
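
For anyone skimming the thread, the behavioural change in the hunk above
boils down to a single predicate. Below is a minimal userspace sketch of it,
not kernel code: should_warn_alloc_failure() is a made-up name, and its two
parameters are stand-ins for fatal_signal_pending(current) and page_order.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the condition in the patched
 * __vmalloc_area_node(). The function name and parameters are
 * illustrative assumptions, not kernel API.
 *
 * fatal_signal: stands in for fatal_signal_pending(current)
 * page_order:   order of the allocation attempt that failed
 *
 * Returns true only when the failure is worth reporting: no fatal
 * signal is pending and the attempt was already at order-0, so there
 * is no smaller order left to retry with.
 */
static bool should_warn_alloc_failure(bool fatal_signal, unsigned int page_order)
{
	return !fatal_signal && page_order == 0;
}
```

Under this model a failed order-9 attempt stays quiet, since the commit
message says it is retried at order-0, and only a failure of the order-0
attempt itself (with no fatal signal pending) gets reported.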