For example, one of my latest OSD crashes looks like this in dmesg:

[Dec 2 08:26] bstore_mempool invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), nodemask=0, order=0, oom_score_adj=0
[ +0.000006] bstore_mempool cpuset=ed46e6fa52c1e40f13389b349c54e62dcc8c65d76c4c7860e2ff7c39444d14cc mems_allowed=0
[ +0.000010] CPU: 3 PID: 3061712 Comm: bstore_mempool Tainted: G W 4.9.312-7 #1
[ +0.000002] Hardware name: Hardkernel ODROID-HC4 (DT)
[ +0.000001] Call trace:
[ +0.000011] [<ffffff800908cce0>] dump_backtrace+0x0/0x230
[ +0.000004] [<ffffff800908cf38>] show_stack+0x28/0x34
[ +0.000005] [<ffffff80094863b8>] dump_stack+0xb0/0xe8
[ +0.000006] [<ffffff8009246378>] dump_header+0x70/0x1d8
[ +0.000005] [<ffffff80091ceaec>] oom_kill_process+0xec/0x490
[ +0.000004] [<ffffff80091cf1e4>] out_of_memory+0x124/0x2e0
[ +0.000003] [<ffffff8009237f98>] mem_cgroup_out_of_memory+0x58/0x80
[ +0.000003] [<ffffff800923e0fc>] mem_cgroup_oom_synchronize+0x35c/0x3d4
[ +0.000003] [<ffffff80091cf3bc>] pagefault_out_of_memory+0x1c/0x80
[ +0.000004] [<ffffff800909f3bc>] do_page_fault+0x38c/0x3b0
[ +0.000003] [<ffffff800909f4b0>] do_translation_fault+0xd0/0xf0
[ +0.000003] [<ffffff8009081338>] do_mem_abort+0x58/0xb0
[ +0.000003] Exception stack(0xffffffc026637df0 to 0xffffffc026637f20)
[ +0.000002] 7de0: 0000007f86d50c78 0000000082000007
[ +0.000004] 7e00: ffffffc026637ec0 0000007f86d50c78 ffffffc08ae3b900 ffffffc08ae3b900
[ +0.000002] 7e20: ffffffc026637ec0 00000055851577c8 0000000060000000 00000000000409ff
[ +0.000003] 7e40: 0000000000000000 ffffff80090837c0 ffffffc026637e90 ffffff800908c51c
[ +0.000003] 7e60: ffffffc026637e90 ffffff800908145c 0000007f86d50c78 0000000082000007
[ +0.000003] 7e80: 0000000000000008 ffffffc08ae3b900 0000000000000000 ffffff800908340c
[ +0.000002] 7ea0: 0000000000000000 00000040c4e3f000 ffffffffffffffff 00000040c4e3f000
[ +0.000003] 7ec0: 0000000000000000 0000007f870d8b88 0000000016e4c66f 108890ff61ee90a0
[ +0.000003] 7ee0: 00000055c5db2828 000000000000017f 0000007f870d6000 00000000219d0e6c
[ +0.000003] 7f00: 0000000000000000 003b9aca00000000 000000006389b6e9 0000000016e4c66f
[ +0.000003] [<ffffff8009081470>] do_el0_ia_bp_hardening+0x90/0xa0
[ +0.000001] Exception stack(0xffffffc026637ea0 to 0xffffffc026637fd0)
[ +0.000003] 7ea0: 0000000000000000 00000040c4e3f000 ffffffffffffffff 00000040c4e3f000
[ +0.000003] 7ec0: 0000000000000000 0000007f870d8b88 0000000016e4c66f 108890ff61ee90a0
[ +0.000003] 7ee0: 00000055c5db2828 000000000000017f 0000007f870d6000 00000000219d0e6c
[ +0.000003] 7f00: 0000000000000000 003b9aca00000000 000000006389b6e9 0000000016e4c66f
[ +0.000002] 7f20: 0000000000000018 000000006389b6e9 0016a9ab002471a6 00003b1b6f26b535
[ +0.000003] 7f40: 0000005585ece630 0000007f86d50c78 0000000000000000 0000005605fb2130
[ +0.000003] 7f60: 0000007f768f9ce8 0000000000000000 0000000000000000 000000001003e8f3
[ +0.000003] 7f80: 000000001003df58 000000001626e380 000000003b9aca00 112e0be826d694b3
[ +0.000002] 7fa0: 0004d7bb2ef84792 0000007f768f9b40 00000055859871ac 0000007f768f9b40
[ +0.000002] 7fc0: 0000007f86d50c78 0000000000000000
[ +0.000003] [<ffffff800908340c>] el0_ia+0x18/0x1c
[ +0.000002] Task in /docker/ed46e6fa52c1e40f13389b349c54e62dcc8c65d76c4c7860e2ff7c39444d14cc killed as a result of limit of /docker/ed46e6fa52c1e40f13389b349c54e62dcc8c65d76c4c7860e2ff7c39444d14cc
[ +0.000011] memory: usage 3072000kB, limit 3072000kB, failcnt 4030563
[ +0.000002] memory+swap: usage 4059888kB, limit 6144000kB, failcnt 0
[ +0.000002] kmem: usage 4596kB, limit 9007199254740988kB, failcnt 0
[ +0.000001] Memory cgroup stats for /docker/ed46e6fa52c1e40f13389b349c54e62dcc8c65d76c4c7860e2ff7c39444d14cc: cache:1308KB rss:3066096KB rss_huge:4096KB mapped_file:308KB dirty:0KB writeback:0KB swap:987888KB inactive_anon:613232KB active_anon:2452864KB inactive_file:864KB active_file:348KB unevictable:0KB
[ +0.000021] [ pid ]   uid  tgid total_vm     rss nr_ptes nr_pmds swapents oom_score_adj name
[ +0.000180] [3061162]    0 3061162      214       0       4       3        8             0 docker-init
[ +0.000004] [3061175]  167 3061175  1294319  764584    2298       9   248700             0 ceph-osd
[ +0.000015] Memory cgroup out of memory: Kill process 3061175 (ceph-osd) score 985 or sacrifice child
[ +0.004798] Killed process 3061175 (ceph-osd) total-vm:5177276kB, anon-rss:3058332kB, file-rss:0kB, shmem-rss:0kB
[ +1.042284] oom_reaper: reaped process 3061175 (ceph-osd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

On Fri, Dec 2, 2022 at 09:47, Daniel Brunner <daniel@brunner.ninja> wrote:
> Hi,
>
> my OSDs are running on odroid-hc4's, which only have about 4 GB of memory,
> and every 10 minutes a random OSD crashes due to running out of memory.
> Sadly, the whole machine becomes unresponsive when memory is completely
> exhausted, so there is no SSH access or Prometheus output in the meantime.
>
> After the OSD has crashed and restarted, and the memory is free again,
> I can look into the machine again.
>
> I've set the memory limit very low on all OSDs:
>
> for i in {0..17} ; do sudo ceph config set osd.$i osd_memory_target 939524096 ; done
>
> which is the absolute minimum, about 0.9 GB.
>
> Why are the OSDs not respecting this limit? I tried enforcing the memory
> limit on the docker container by appending -m3200M to the docker run
> command, which helps with the unresponsiveness when memory runs out.
> The Linux kernel now kills the ceph-osd process earlier, while a few
> memory resources are still left.
>
> How can I stop ceph-osd from crashing? Decreasing pg_num and pgp_num
> on my only CephFS pool did not work; the number is still high after
> setting it to 16.
>
> Best regards

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
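One detail worth keeping in mind here: osd_memory_target is a best-effort target that BlueStore's cache autotuning tries to stay under, not a hard cap, so the process can overshoot it. A common approach is to leave headroom between the target and any hard container limit. A minimal sketch of configuring both consistently (the 3 GiB container limit and the 75% ratio are illustrative assumptions, not values from this thread):

```shell
# Assumption: a 3 GiB hard limit on the container (docker run -m),
# with osd_memory_target set to ~75% of it so the OSD has headroom
# before the cgroup OOM killer fires. Both numbers are examples.

container_limit_bytes=$(( 3 * 1024 * 1024 * 1024 ))     # 3 GiB hard limit
osd_target_bytes=$(( container_limit_bytes * 3 / 4 ))   # ~25% headroom

# Apply the target to all 18 OSDs, as in the loop quoted above.
for i in {0..17}; do
  sudo ceph config set "osd.$i" osd_memory_target "$osd_target_bytes"
done

echo "$osd_target_bytes"   # 2415919104 bytes (~2.25 GiB)
```

The matching hard limit would then be passed when starting the container, e.g. `docker run -m 3g ...`. Whether 75% is enough headroom depends on the workload; the key point is only that the target should sit well below the cgroup limit.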