On Mon, May 22, 2023 at 06:00:42PM +0200, Uladzislau Rezki wrote:
> > Hi,
> >
> > I notice a regression report on Bugzilla [1]. Quoting from it:
> >
> > > after updating from 6.2.x to 6.3.x, vmalloc error messages started to appear in the dmesg
> > >
> > > # free
> > >               total        used        free      shared  buff/cache   available
> > > Mem:       16183724     1473068      205664       33472    14504992    14335700
> > > Swap:      16777212      703596    16073616
> > >
> > > (zswap enabled)
> >
> > See bugzilla for the full thread and attached dmesg.
> >
> > On the report, the reporter can't perform the required bisection,
> > unfortunately.
> >
> > Anyway, I'm adding it to regzbot:
> >
> > #regzbot introduced: v6.2..v6.3 https://bugzilla.kernel.org/show_bug.cgi?id=217466
> > #regzbot title: btrfs_work_helper dealloc error in v6.3.x
> >
> > Thanks.
> >
> > [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217466
> >
> According to dmesg output from the bugzilla, the vmalloc tries to
> allocate high order pages: 1 << 9. Since it fails to get a order-9 page
> you get the warning:

Wanting an order-9 allocation is intentional; it's for a compression
workspace (bugzilla comment 5). It's allocated with kvzalloc, i.e. with
a fallback to vmalloc in case the first attempt fails.

> <snip>
> 	if (area->nr_pages != nr_small_pages) {
> 		/* vm_area_alloc_pages() can also fail due to a fatal signal */
> 		if (!fatal_signal_pending(current))
> 			warn_alloc(gfp_mask, NULL,
> 				"vmalloc error: size %lu, page order %u, failed to allocate pages",
> 				area->nr_pages * PAGE_SIZE, page_order);
> 		goto fail;
> 	}
> <snip>
>
> and it fails.
>
> If the __GFP_NOFAIL is passed, the vm_area_alloc_pages() function switches
> to allocate 0-order pages instead. I think the fix is to call the
> kvmalloc_node() with __GFP_NOFAIL flag.

__GFP_NOFAIL does not make sense here, and we've tried hard not to use
it anywhere because of its deadlocky effects. Did you mean __GFP_NOWARN?
That's a patch I sent today, but there's another comment in the bugzilla
saying we got more allocation warnings for huge (2M) allocations, this
time for a deduplication ioctl. This seems to be a noticeable change in
6.3; before we disable the warning in our code, I think the MM guys
could have a look. So far it seems we're about to paper over a problem.
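
To make the difference between the two flags concrete, here is a toy
userspace model, not kernel code. The names MODEL_NOWARN, MODEL_NOFAIL,
struct model_result and model_vmalloc() are hypothetical and exist only
for this sketch. It assumes order-0 pages are always available, and it
takes the order-0 retry under NOFAIL from the description quoted above
rather than from the actual implementation.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this toy model; NOT the kernel's gfp values. */
#define MODEL_NOWARN (1u << 0)	/* stands in for __GFP_NOWARN */
#define MODEL_NOFAIL (1u << 1)	/* stands in for __GFP_NOFAIL */

struct model_result {
	bool ok;	/* the request was satisfied */
	bool warned;	/* a "vmalloc error" style message would be printed */
	unsigned order;	/* page order finally used */
};

/*
 * Toy model of the vmalloc side of the discussion.
 * high_order_ok simulates whether pages of the requested order exist.
 * Assumption of this sketch: order-0 pages can always be obtained.
 */
static struct model_result model_vmalloc(unsigned flags, unsigned order,
					 bool high_order_ok)
{
	struct model_result r = { .ok = false, .warned = false, .order = order };

	if (order == 0 || high_order_ok) {
		r.ok = true;
		return r;
	}

	/* Per the quoted description: NOFAIL retries with order-0 pages. */
	if (flags & MODEL_NOFAIL) {
		r.ok = true;
		r.order = 0;
		return r;
	}

	/* Failure path: NOWARN only hides the message, the result is the same. */
	r.warned = !(flags & MODEL_NOWARN);
	return r;
}
```

In this model, MODEL_NOWARN changes only whether the message appears;
the order-9 request still fails. That is why silencing the warning
alone would hide the behaviour change rather than explain it.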