Assistance Required: Ceph OSD Out of Memory (OOM) Issue

Dear Ceph Community,

I hope this message finds you well.

I am encountering an out-of-memory (OOM) issue with one of my Ceph OSDs,
which is repeatedly getting killed by the OOM killer on my system. Below
are the relevant details from the log:

*OOM Log*:
[Wed Oct 30 13:14:48 2024]
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/system-ceph\x2dosd.slice,task=ceph-osd,pid=6213,uid=64045
[Wed Oct 30 13:14:48 2024] Out of memory: Killed process 6213 (ceph-osd)
total-vm:216486528kB, anon-rss:211821164kB, file-rss:0kB, shmem-rss:0kB,
UID:64045 pgtables:418836kB oom_score_adj:0
[Wed Oct 30 13:14:58 2024] oom_reaper: reaped process 6213 (ceph-osd), now
anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

*Ceph OSD Log*:

2024-10-30T13:15:30.207+0600 7f906c74dd80  0 _get_class not permitted to
load lua
2024-10-30T13:15:30.211+0600 7f906c74dd80  0 <cls>
/build/ceph-15.2.17/src/cls/hello/cls_hello.cc:312: loading cls_hello
2024-10-30T13:15:30.215+0600 7f906c74dd80  0 _get_class not permitted to
load kvs
2024-10-30T13:15:30.219+0600 7f906c74dd80  0 _get_class not permitted to
load queue
2024-10-30T13:15:30.223+0600 7f906c74dd80  0 <cls>
/build/ceph-15.2.17/src/cls/cephfs/cls_cephfs.cc:198: loading cephfs
2024-10-30T13:15:30.223+0600 7f906c74dd80  0 osd.13 299547 crush map has
features 432629239337189376, adjusting msgr requires for clients
2024-10-30T13:15:30.223+0600 7f906c74dd80  0 osd.13 299547 crush map has
features 432629239337189376 was 8705, adjusting msgr requires for mons
2024-10-30T13:15:30.223+0600 7f906c74dd80  0 osd.13 299547 crush map has
features 3314933000854323200, adjusting msgr requires for osds
2024-10-30T13:15:30.223+0600 7f906c74dd80  1 osd.13 299547
check_osdmap_features require_osd_release unknown -> octopus
2024-10-30T13:15:31.023+0600 7f906c74dd80  0 osd.13 299547 load_pgs

*Environment Details*:

   - Ceph Version: 15.2.17 (Octopus)
   - OSD: osd.13
   - Kernel: Linux kernel version

It seems that the OSD process is consuming a substantial amount of memory
(total-vm: 216486528 kB, anon-rss: 211821164 kB, i.e. roughly 200 GiB
resident), which is what triggers the OOM kills on the node. The OSD service
restarts automatically, but it quickly returns to the same excessive memory
consumption and the OSD goes down again.

Could you please provide guidance or suggestions on how to mitigate this
issue? Are there any known memory management settings, configuration
adjustments, or OSD-specific tuning parameters that could help prevent this
from recurring?
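
The only OSD-side memory knob I am aware of is the BlueStore autotuning
target, osd_memory_target. A minimal sketch of how I understand it would be
checked and adjusted (the 4 GiB value below is just an example, not what is
currently set on my cluster):

    # check the current target (the default is 4 GiB in Octopus, as far as I know)
    ceph config get osd.13 osd_memory_target

    # lower the target for the affected OSD, e.g. to 4 GiB
    ceph config set osd.13 osd_memory_target 4294967296

I am not sure this setting alone would explain roughly 200 GiB of resident
memory, so pointers to anything else worth checking (for example PG log
growth) would also be very welcome.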

Any help would be greatly appreciated.

Thank you for your time and assistance!



Regards
Mosharaf Hossain
Manager, Product Development
Bangladesh Online (BOL)

Level 8, SAM Tower, Plot 4, Road 22, Gulshan 1, Dhaka 1212, Bangladesh
Tel: +880 9609 000 999, +880 2 58815559, Ext: 14191, Fax: +880 2 2222 95757
Cell: +880 1787 680828, Web: www.bol-online.com