Re: Fwd: vmalloc error: btrfs-delalloc btrfs_work_helper [btrfs] in kernel 6.3.x

David Sterba <dsterba@xxxxxxx> · Mon, 22 May 2023 21:09:36 +0200

On Mon, May 22, 2023 at 06:00:42PM +0200, Uladzislau Rezki wrote:
> > Hi,
> > 
> > I notice a regression report on Bugzilla [1]. Quoting from it:
> > 
> > > after updating from 6.2.x to 6.3.x, vmalloc error messages started to appear in the dmesg
> > > 
> > > 
> > > 
> > > # free 
> > >                total        used        free      shared  buff/cache   available
> > > Mem:        16183724     1473068      205664       33472    14504992    14335700
> > > Swap:       16777212      703596    16073616
> > > 
> > > 
> > > (zswap enabled)
> > 
> > See bugzilla for the full thread and attached dmesg.
> > 
> > On the report, the reporter can't perform the required bisection,
> > unfortunately.
> > 
> > Anyway, I'm adding it to regzbot:
> > 
> > #regzbot introduced: v6.2..v6.3 https://bugzilla.kernel.org/show_bug.cgi?id=217466
> > #regzbot title: btrfs_work_helper dealloc error in v6.3.x
> > 
> > Thanks.
> > 
> > [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217466
> > 
> According to dmesg output from the bugzilla, the vmalloc tries to
> allocate high order pages: 1 << 9. Since it fails to get an order-9 page
> you get the warning:

That we want an order-9 allocation is intentional, it's for a compression
workspace (bugzilla comment 5). It's allocated with kvzalloc(), i.e. with
the fallback to vmalloc in case the first attempt (kmalloc) fails.
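
For reference, the arithmetic behind "order 9" can be sketched as below.
This is only an illustration, assuming 4 KiB base pages (typical for
x86-64, other configurations differ). The helper names and
SKETCH_PAGE_SHIFT are made up for this example and are not kernel
symbols.

```c
#include <stddef.h>

/* Assumption: 4 KiB base pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/* Number of base pages in one contiguous block of the given order. */
static size_t order_to_pages(unsigned int order)
{
	return (size_t)1 << order;
}

/* Size in bytes of one contiguous block of the given order. */
static size_t order_to_bytes(unsigned int order)
{
	return (size_t)1 << (order + SKETCH_PAGE_SHIFT);
}
```

Under that assumption order_to_pages(9) is 512 pages and
order_to_bytes(9) is 2 MiB, which matches the huge (2M) allocations
mentioned in the bugzilla.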

> <snip>
> 	if (area->nr_pages != nr_small_pages) {
> 		/* vm_area_alloc_pages() can also fail due to a fatal signal */
> 		if (!fatal_signal_pending(current))
> 			warn_alloc(gfp_mask, NULL,
> 				"vmalloc error: size %lu, page order %u, failed to allocate pages",
> 				area->nr_pages * PAGE_SIZE, page_order);
> 		goto fail;
> 	}
> <snip>
> 
> and it fails.
> 
> If the __GFP_NOFAIL is passed, the vm_area_alloc_pages() function switches
> to allocate 0-order pages instead. I think the fix is to call the
> kvmalloc_node() with __GFP_NOFAIL flag.

__GFP_NOFAIL does not make sense here and we've tried hard not to use
it anywhere because of the deadlocky effects. Did you mean __GFP_NOWARN?
That's a patch I sent today, but there's another comment in the bugzilla
saying that we got more allocation warnings for huge (2M) allocations,
this time for a deduplication ioctl.
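
To make the difference concrete, here is a small userspace model of when
the warning from the quoted snippet would be printed. It is a sketch, not
kernel code: the struct, the function and the SKETCH_GFP_NOWARN bit are
hypothetical, and it assumes (as far as I know) that warn_alloc() returns
early when __GFP_NOWARN is set.

```c
#include <stdbool.h>

/* Hypothetical flag bit for this model, not the kernel's actual value. */
#define SKETCH_GFP_NOWARN 0x1u

/* Outcome of a page allocation attempt for a vmalloc area (model only). */
struct sketch_alloc_result {
	unsigned long nr_pages_got;
	unsigned long nr_pages_wanted;
};

/*
 * Returns true if the "vmalloc error" message would be printed.
 * The allocation fails in every case where got != wanted; the flag
 * only decides whether the failure is reported.
 */
static bool sketch_should_warn(struct sketch_alloc_result r,
			       unsigned int gfp_mask, bool fatal_signal)
{
	if (r.nr_pages_got == r.nr_pages_wanted)
		return false;	/* allocation succeeded */
	if (fatal_signal)
		return false;	/* failure caused by a fatal signal */
	if (gfp_mask & SKETCH_GFP_NOWARN)
		return false;	/* caller asked for silence */
	return true;
}
```

In this model the NOWARN bit changes only what ends up in dmesg; the
allocation itself still fails.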

This seems to be a noticeable change in 6.3. Before we disable the
warning in our code, I think the MM guys could have a look. So far it
seems we're about to paper over a problem.