On Tue, Oct 31, 2023 at 05:13:57PM -0700, Christoph Lameter (Ampere) wrote:
> Hi Matthew,
>
> There is a strange warning on bootup related to folios. Seen it a couple of
> times before. Why does this occur?

Filesystems generally can't cope with failing to allocate a bufferhead.
So the buffer head code sets __GFP_NOFAIL. That's better than trying
to implement __GFP_NOFAIL semantics in the fs code, right?

> [ 20.878110] Call trace:
> [ 20.878111] get_page_from_freelist+0x214/0x17f8
> [ 20.878116] __alloc_pages+0x17c/0xe08
> [ 20.878120] __kmalloc_large_node+0xa0/0x170
> [ 20.878123] __kmalloc_node+0x120/0x1d0
> [ 20.878125] memcg_alloc_slab_cgroups+0x48/0xc0

Oho. It's not buffer's fault, specifically. memcg is allocating
its own metadata for the slab. I decree this Not My Fault.

> [ 20.878128] memcg_slab_post_alloc_hook+0xa8/0x1c8
> [ 20.878132] kmem_cache_alloc+0x18c/0x338
> [ 20.878135] alloc_buffer_head+0x28/0xa0
> [ 20.878138] folio_alloc_buffers+0xe8/0x1c0
> [ 20.878141] folio_create_empty_buffers+0x2c/0x1e8
> [ 20.878143] folio_create_buffers+0x58/0x80
> [ 20.878145] block_read_full_folio+0x80/0x450
> [ 20.878148] blkdev_read_folio+0x24/0x38
> [ 20.956921] filemap_read_folio+0x60/0x138
> [ 20.956925] do_read_cache_folio+0x180/0x298
> [ 20.965270] read_cache_page+0x24/0x90
> [ 20.965273] __arm64_sys_swapon+0x2e0/0x1208
> [ 20.965277] invoke_syscall+0x78/0x108
> [ 20.965282] el0_svc_common.constprop.0+0x48/0xf0
> [ 20.981702] do_el0_svc+0x24/0x38
> [ 20.993773] el0t_64_sync_handler+0x100/0x130
> [ 20.993776] el0t_64_sync+0x190/0x198
> [ 20.993779] ---[ end trace 0000000000000000 ]---
> [ 20.999972] Adding 999420k swap on /dev/mapper/eng07sys--r113--vg-swap_1.
> Priority:-2 extents:1 across:999420k SS
>
> This is due to
>
> folio_alloc_buffers() setting GFP_NOFAIL:
>
> struct buffer_head *folio_alloc_buffers(struct folio *folio, unsigned long size,
>                                         bool retry)
> {
>         struct buffer_head *bh, *head;
>         gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
>         long offset;
>         struct mem_cgroup *memcg, *old_memcg;
>
>         if (retry)
>                 gfp |= __GFP_NOFAIL;

This isn't new. It was introduced by 640ab98fb362 in 2017.
It seems reasonable to be able to kmalloc(512, GFP_NOFAIL). It's the
memcg code which is having problems here.