Ok, I set the rsize/wsize mount options to 32MB, and after a fresh VM reboot I got a page allocation failure again. This time it is the first kernel stack trace after the reboot - there were no "hung task for more than 120s" warnings prior to it.
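For reference, the mount now looks roughly like the line below - the monitor address, mount point and secret file are placeholders, the relevant part is rsize/wsize = 33554432 bytes (32M):

  mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret,rsize=33554432,wsize=33554432

Trace follows: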
[Sat Apr 11 17:52:58 2020] kworker/7:1: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[Sat Apr 11 17:52:58 2020] CPU: 7 PID: 2638 Comm: kworker/7:1 Not tainted 5.4.30 #5
[Sat Apr 11 17:52:58 2020] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[Sat Apr 11 17:52:58 2020] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[Sat Apr 11 17:52:58 2020] Call Trace:
[Sat Apr 11 17:52:58 2020] <IRQ>
[Sat Apr 11 17:52:58 2020] dump_stack+0x6d/0x95
[Sat Apr 11 17:52:58 2020] warn_alloc+0x10c/0x170
[Sat Apr 11 17:52:58 2020] __alloc_pages_slowpath+0xe6c/0xef0
[Sat Apr 11 17:52:58 2020] ? ip_local_deliver_finish+0x48/0x50
[Sat Apr 11 17:52:58 2020] ? ip_local_deliver+0x6f/0xe0
[Sat Apr 11 17:52:58 2020] ? tcp_v4_early_demux+0x11c/0x150
[Sat Apr 11 17:52:58 2020] __alloc_pages_nodemask+0x2f3/0x360
[Sat Apr 11 17:52:58 2020] page_frag_alloc+0x129/0x150
[Sat Apr 11 17:52:58 2020] __napi_alloc_skb+0x86/0xd0
[Sat Apr 11 17:52:58 2020] page_to_skb+0x67/0x350 [virtio_net]
[Sat Apr 11 17:52:58 2020] receive_buf+0xe47/0x16c0 [virtio_net]
[Sat Apr 11 17:52:58 2020] virtnet_poll+0xf2/0x364 [virtio_net]
[Sat Apr 11 17:52:58 2020] net_rx_action+0x265/0x3e0
[Sat Apr 11 17:52:58 2020] __do_softirq+0xf9/0x2aa
[Sat Apr 11 17:52:58 2020] irq_exit+0xae/0xb0
[Sat Apr 11 17:52:58 2020] do_IRQ+0x59/0xe0
[Sat Apr 11 17:52:58 2020] common_interrupt+0xf/0xf
[Sat Apr 11 17:52:58 2020] </IRQ>
[Sat Apr 11 17:52:58 2020] RIP: 0010:__check_object_size+0x40/0x1b0
[Sat Apr 11 17:52:58 2020] Code: 4c 8d 34 3e 41 55 41 54 44 0f b6 ea 49 8d 46 ff 53 49 89 f4 48 89 fb 48 39 c7 0f 87 48 01 00 00 48 83 ff 10 0f 86 54 01 00 00 <e8> 0b fe ff ff 85 c0 74 27 78 0f 83 f8 02 7f 0a 5b 41 5c 41 5d 41
[Sat Apr 11 17:52:58 2020] RSP: 0018:ffffb96c00c479f8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffffdc
[Sat Apr 11 17:52:58 2020] RAX: ffff92b9f36a7aca RBX: ffff92b9f36a7ab6 RCX: ffffb96c00c47c58
[Sat Apr 11 17:52:58 2020] RDX: 0000000000000001 RSI: 0000000000000015 RDI: ffff92b9f36a7ab6
[Sat Apr 11 17:52:58 2020] RBP: ffffb96c00c47a18 R08: 0000000000000000 R09: 0000000000000000
[Sat Apr 11 17:52:58 2020] R10: 0000000000000015 R11: ffff92bbea8957f8 R12: 0000000000000015
[Sat Apr 11 17:52:58 2020] R13: 0000000000000001 R14: ffff92b9f36a7acb R15: 0000000000000005
[Sat Apr 11 17:52:58 2020] simple_copy_to_iter+0x2a/0x50
[Sat Apr 11 17:52:58 2020] __skb_datagram_iter+0x162/0x280
[Sat Apr 11 17:52:58 2020] ? skb_kill_datagram+0x70/0x70
[Sat Apr 11 17:52:58 2020] skb_copy_datagram_iter+0x40/0xa0
[Sat Apr 11 17:52:58 2020] tcp_recvmsg+0x6f7/0xbc0
[Sat Apr 11 17:52:58 2020] ? aa_sk_perm+0x43/0x190
[Sat Apr 11 17:52:58 2020] inet_recvmsg+0x64/0xf0
[Sat Apr 11 17:52:58 2020] sock_recvmsg+0x66/0x70
[Sat Apr 11 17:52:58 2020] ceph_tcp_recvmsg+0x6f/0xa0 [libceph]
[Sat Apr 11 17:52:58 2020] read_partial.isra.26+0x50/0x80 [libceph]
[Sat Apr 11 17:52:58 2020] read_partial_message+0x198/0x7c0 [libceph]
[Sat Apr 11 17:52:58 2020] ceph_con_workfn+0xa6a/0x23d0 [libceph]
[Sat Apr 11 17:52:58 2020] ? __dir_lease_try_check+0x90/0x90 [ceph]
[Sat Apr 11 17:52:58 2020] process_one_work+0x167/0x400
[Sat Apr 11 17:52:58 2020] worker_thread+0x4d/0x460
[Sat Apr 11 17:52:58 2020] kthread+0x105/0x140
[Sat Apr 11 17:52:58 2020] ? rescuer_thread+0x370/0x370
[Sat Apr 11 17:52:58 2020] ? kthread_destroy_worker+0x50/0x50
[Sat Apr 11 17:52:58 2020] ret_from_fork+0x35/0x40
[Sat Apr 11 17:52:58 2020] Mem-Info:
[Sat Apr 11 17:52:58 2020] active_anon:6795 inactive_anon:8914 isolated_anon:0 active_file:6364 inactive_file:3914973 isolated_file:0 unevictable:0 dirty:25 writeback:0 unstable:0 slab_reclaimable:52533 slab_unreclaimable:45778 mapped:5289 shmem:564 pagetables:1216 bounce:0 free:24896 free_pcp:5276 free_cma:0
[Sat Apr 11 17:52:58 2020] Node 0 active_anon:27180kB inactive_anon:35656kB active_file:25456kB inactive_file:15659892kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:21156kB dirty:100kB writeback:0kB shmem:2256kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[Sat Apr 11 17:52:58 2020] Node 0 DMA free:15908kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Sat Apr 11 17:52:58 2020] lowmem_reserve[]: 0 3444 15930 15930 15930
[Sat Apr 11 17:52:58 2020] Node 0 DMA32 free:55068kB min:14596kB low:18244kB high:21892kB active_anon:1940kB inactive_anon:2316kB active_file:1748kB inactive_file:3386696kB unevictable:0kB writepending:56kB present:3653608kB managed:3588072kB mlocked:0kB kernel_stack:16kB pagetables:28kB bounce:0kB free_pcp:11168kB local_pcp:1552kB free_cma:0kB
[Sat Apr 11 17:52:58 2020] lowmem_reserve[]: 0 0 12485 12485 12485
[Sat Apr 11 17:52:58 2020] Node 0 Normal free:28608kB min:77492kB low:90720kB high:103948kB active_anon:25240kB inactive_anon:33340kB active_file:23708kB inactive_file:12273660kB unevictable:0kB writepending:44kB present:13107200kB managed:12793540kB mlocked:0kB kernel_stack:7504kB pagetables:4836kB bounce:0kB free_pcp:9936kB local_pcp:1360kB free_cma:0kB
[Sat Apr 11 17:52:58 2020] lowmem_reserve[]: 0 0 0 0 0
[Sat Apr 11 17:52:58 2020] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[Sat Apr 11 17:52:58 2020] Node 0 DMA32: 17*4kB (MEH) 22*8kB (UMH) 18*16kB (MH) 2*32kB (ME) 147*64kB (UME) 70*128kB (UME) 16*256kB (UM) 15*512kB (UME) 4*1024kB (UME) 2*2048kB (M) 4*4096kB (M) = 55316kB
[Sat Apr 11 17:52:58 2020] Node 0 Normal: 187*4kB (UMEH) 62*8kB (UMEH) 280*16kB (UMEH) 318*32kB (UMEH) 98*64kB (UMEH) 45*128kB (UMEH) 4*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 28956kB
[Sat Apr 11 17:52:58 2020] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Sat Apr 11 17:52:58 2020] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Apr 11 17:52:58 2020] 3922168 total pagecache pages
[Sat Apr 11 17:52:58 2020] 279 pages in swap cache
[Sat Apr 11 17:52:58 2020] Swap cache stats: add 4764, delete 4485, find 347/500
[Sat Apr 11 17:52:58 2020] Free swap = 4019440kB
[Sat Apr 11 17:52:58 2020] Total swap = 4038652kB
[Sat Apr 11 17:52:58 2020] 4194200 pages RAM
[Sat Apr 11 17:52:58 2020] 0 pages HighMem/MovableOnly
[Sat Apr 11 17:52:58 2020] 94820 pages reserved
[Sat Apr 11 17:52:58 2020] 0 pages cma reserved
[Sat Apr 11 17:52:58 2020] 0 pages hwpoisoned
[Sat Apr 11 17:57:01 2020] warn_alloc: 50 callbacks suppressed

On Sat, Apr 11, 2020 at 3:20 PM Jesper Krogh <jesper.krogh@xxxxxxxxx> wrote:
>
> Ok, i'll change the mount options and report back in a few days on
> that topic. This is fairly reproducible, so I would expect to see the
> effect if it works.
>
> On Fri, Apr 10, 2020 at 6:32 PM Tony Lill <ajlill@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> >
> >
> > On 4/10/20 3:13 AM, Jesper Krogh wrote:
> > > Hi. What is the suggested change? - is it Ceph that has an rsize,wsize
> > > of 64MB?
> > >
> >
> > Sorry, set rsize and wsize in the mount options for your cephfs to
> > something smaller.
> >
> > My problem with this is that I use autofs to mount my filesystem.
> > Starting with 4.14.82, after a few mount/unmount cycles, the mount would
> > fail with an order 4 allocation error, and I'd have to reboot.
> >
> > I traced it to a change that doubled CEPH_MSG_MAX_DATA_LEN from 16M to
> > 32M. Later in the 5-series kernels, this was doubled again, and that
> > caused an order 5 allocation failure. This define is used to set the max
> > and default rsize and wsize.
> >
> > Reducing the rsize and wsize in the mount options fixed the problem for
> > me. This may do nothing for you, but, if it clears your allocation issue...
> >
> >
> > > On Fri, Apr 10, 2020 at 12:47 AM Tony Lill <ajlill@xxxxxxxxxxxxxxxxxxx> wrote:
> > >>
> > >>
> > >> On 4/9/20 12:30 PM, Jeff Layton wrote:
> > >>> On Thu, 2020-04-09 at 18:00 +0200, Jesper Krogh wrote:
> > >>>> Thanks Jeff - I'll try that.
> > >>>>
> > >>>> I would just add to the case that this is a problem we have had on a
> > >>>> physical machine - but too many "other" workloads at the same time -
> > >>>> so we isolated it off to a VM - assuming that it was the mixed
> > >>>> workload situation that did cause us issues. I cannot be sure that it
> > >>>> is "exactly" the same problem we're seeing but symptoms are
> > >>>> identical.
> > >>>>
> > >>>
> > >>> Do you see the "page allocation failure" warnings on bare metal hosts
> > >>> too? If so, then maybe we're dealing with a problem that isn't
> > >>> virtio_net specific. In any case, let's get some folks more familiar
> > >>> with that area involved first and take it from there.
> > >>>
> > >>> Feel free to cc me on the bug report too.
> > >>>
> > >>> Thanks,
> > >>>
> > >>
> > >> In 5.4.20, the default rsize and wsize is 64M. This has caused me page
> > >> allocation failures in a different context. Try setting it to something
> > >> sensible.
> > >> --
> > >> Tony Lill, OCT, ajlill@xxxxxxxxxxxxxxxxxxx
> > >> President, A. J. Lill Consultants (519) 650 0660
> > >> 539 Grand Valley Dr., Cambridge, Ont. N3H 2S2 (519) 241 2461
> > >> -------------- http://www.ajlc.waterloo.on.ca/ ---------------
> > >>
> > >>
> > >>
> >
> > --
> > Tony Lill, OCT, ajlill@xxxxxxxxxxxxxxxxxxx
> > President, A. J. Lill Consultants (519) 650 0660
> > 539 Grand Valley Dr., Cambridge, Ont. N3H 2S2 (519) 241 2461
> > -------------- http://www.ajlc.waterloo.on.ca/ ---------------
> >
> >
> >
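PS: For anyone skimming the thread later - the define Tony mentions lives in include/linux/ceph/libceph.h. Paraphrasing from memory of the 5.4 source (please check your own tree), it is roughly:

    /* include/linux/ceph/libceph.h, 5.4 - paraphrased, check your tree */
    #define CEPH_MSG_MAX_DATA_LEN   (64*1024*1024)

    /* fs/ceph/super.h derives the maximum (and default) rsize/wsize from this
     * define, which is why an uncapped cephfs mount on 5.4 ends up issuing
     * 64M reads and writes unless rsize/wsize are set lower at mount time. */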