Right. Well, let's add the cgroup folks to this.
The code simply passes GFP_NOFAIL through when allocating cgroup
metadata at an order > 1:
int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s,
			     gfp_t gfp, bool new_slab)
{
	unsigned int objects = objs_per_slab(s, slab);
	unsigned long memcg_data;
	void *vec;

	gfp &= ~OBJCGS_CLEAR_MASK;
	vec = kcalloc_node(objects, sizeof(struct obj_cgroup *), gfp,
			   slab_nid(slab));
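
To make the order arithmetic concrete, here is a rough userspace sketch
(not kernel code). The helper names alloc_order(), objcg_vec_bytes() and
nofail_order_warns() are made up for illustration, and the 4096-byte
page and 8-byte pointer used in the numbers below are assumptions, not
values taken from the affected machine. The warning condition reflects
my reading of the page allocator, which as far as I can tell complains
when a NOFAIL request has an order greater than 1.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Smallest order such that (page_size << order) >= size.
 * Same idea as the kernel's get_order(), written out as a loop.
 */
static unsigned int alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* The per-slab vector holds one pointer per object in the slab. */
static size_t objcg_vec_bytes(unsigned int objects, size_t ptr_size)
{
	return (size_t)objects * ptr_size;
}

/*
 * Assumed warning condition: a NOFAIL allocation is only expected
 * to be requested at order 0 or 1.
 */
static bool nofail_order_warns(unsigned int order, bool nofail)
{
	return nofail && order > 1;
}
```

With those assumed sizes, a slab holding 1024 objects needs an 8192-byte
vector (order 1, no warning), while one holding 2048 objects needs
16384 bytes (order 2), which would trip the check if the NOFAIL flag is
still set at that point.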
On Wed, 1 Nov 2023, Matthew Wilcox wrote:
On Tue, Oct 31, 2023 at 05:13:57PM -0700, Christoph Lameter (Ampere) wrote:
Hi Matthew,
There is a strange warning on bootup related to folios. Seen it a couple of
times before. Why does this occur?
Filesystems generally can't cope with failing to allocate a bufferhead.
So the buffer head code sets __GFP_NOFAIL. That's better than trying
to implement __GFP_NOFAIL semantics in the fs code, right?
[ 20.878110] Call trace:
[ 20.878111] get_page_from_freelist+0x214/0x17f8
[ 20.878116] __alloc_pages+0x17c/0xe08
[ 20.878120] __kmalloc_large_node+0xa0/0x170
[ 20.878123] __kmalloc_node+0x120/0x1d0
[ 20.878125] memcg_alloc_slab_cgroups+0x48/0xc0
Oho. It's not buffer's fault, specifically. memcg is allocating
its own metadata for the slab. I decree this Not My Fault.
[ 20.878128] memcg_slab_post_alloc_hook+0xa8/0x1c8
[ 20.878132] kmem_cache_alloc+0x18c/0x338
[ 20.878135] alloc_buffer_head+0x28/0xa0
[ 20.878138] folio_alloc_buffers+0xe8/0x1c0
[ 20.878141] folio_create_empty_buffers+0x2c/0x1e8
[ 20.878143] folio_create_buffers+0x58/0x80
[ 20.878145] block_read_full_folio+0x80/0x450
[ 20.878148] blkdev_read_folio+0x24/0x38
[ 20.956921] filemap_read_folio+0x60/0x138
[ 20.956925] do_read_cache_folio+0x180/0x298
[ 20.965270] read_cache_page+0x24/0x90
[ 20.965273] __arm64_sys_swapon+0x2e0/0x1208
[ 20.965277] invoke_syscall+0x78/0x108
[ 20.965282] el0_svc_common.constprop.0+0x48/0xf0
[ 20.981702] do_el0_svc+0x24/0x38
[ 20.993773] el0t_64_sync_handler+0x100/0x130
[ 20.993776] el0t_64_sync+0x190/0x198
[ 20.993779] ---[ end trace 0000000000000000 ]---
[ 20.999972] Adding 999420k swap on /dev/mapper/eng07sys--r113--vg-swap_1.
Priority:-2 extents:1 across:999420k SS
This is due to folio_alloc_buffers() setting GFP_NOFAIL:
struct buffer_head *folio_alloc_buffers(struct folio *folio, unsigned long size,
					bool retry)
{
	struct buffer_head *bh, *head;
	gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
	long offset;
	struct mem_cgroup *memcg, *old_memcg;

	if (retry)
		gfp |= __GFP_NOFAIL;
This isn't new. It was introduced by 640ab98fb362 in 2017.
It seems reasonable to be able to kmalloc(512, GFP_NOFAIL). It's the
memcg code which is having problems here.