Ceph OOM Killer Luminous

Hi,

We have a Luminous cluster that was recently upgraded from Hammer --> Jewel --> Luminous 12.2.8. Since the upgrade, a few nodes have been running out of memory and dying, and the logs show the OOM killer being invoked. We did not have this issue before the upgrade. The only difference we can find is that the nodes without the issue are R730xd and the ones with the memory leak are R740xd. The hardware vendor does not see anything wrong with the hardware. On the Ceph side the cluster otherwise runs without problems; the only issue is the memory leak. For now we are proactively rebooting the nodes on a regular schedule to avoid crashes. On one R740xd node we set all the OSDs to 0.0, and there is no memory leak there. Any pointers to fix the issue would be helpful.

Thanks,
Pardhiv Karri



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
