CEPH monitor slow ops


Hello,

we have a Ceph cluster with 144 NVMe "disks"; the "background"
network is RoCE. The cluster runs version 18.2.2 and was installed
with the cephadm orchestrator; the container runtime is podman.
The OS is Debian bookworm. podman was at version 4.3.1+ds1-8+b1,
and we have now installed version 4.3.1+ds1-8+deb12u1 (the
standard Debian bookworm package).

We sometimes have a problem with a laggy OSD. This Sunday the
problem was with osd.25. It was laggy, some other OSDs reported it
to the MON, and the MON removed it from the crush map. osd.25
complained about this, and the MON tried to add it back to the
crush map. This back-and-forth made the whole Ceph cluster laggy
with slow ops, which was a "kiss of death" for some Proxmox
VMs :-(.

When I restarted the osd.25 daemon, the cluster went back to a
HEALTHY state after a couple of minutes.
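
A minimal sketch of how such a restart could be issued through
the orchestrator on a cephadm-managed cluster, assuming the
standard "ceph orch daemon restart" and "ceph health detail"
commands are available on an admin host. The helper names
(restart_cmd, health_cmd, run) are hypothetical, and the script
defaults to a dry run that only prints the commands:

```python
import subprocess


def restart_cmd(osd_id: int) -> list[str]:
    """argv for restarting a single OSD daemon via the cephadm orchestrator."""
    return ["ceph", "orch", "daemon", "restart", f"osd.{osd_id}"]


def health_cmd() -> list[str]:
    """argv for checking cluster health after the restart."""
    return ["ceph", "health", "detail"]


def run(cmd: list[str], dry_run: bool = True) -> str:
    """Return the command line in dry-run mode, otherwise execute it.

    Assumption: the script runs on a host with a working ceph CLI
    and admin keyring when dry_run is False.
    """
    if dry_run:
        return " ".join(cmd)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(run(restart_cmd(25)))
    print(run(health_cmd()))
```

Keeping the dry run as the default makes it possible to review
the exact commands before anything touches the cluster.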

I have prepared logs from 3 machines:

c2-dc1-elected-mon-osd.25.txt - machine c2-dc1 had the elected MON
and 12 OSDs, some of which complained about osd.25.

c1-dc1-mon-osd.25.txt - machine with another MON and 12 OSDs

c4-dc2-osd.25.txt - machine with the osd.25 daemon (and 11 other
OSDs)

The problem was that the first laggy daemon occurred around
6 a.m., but the problem persisted until about 4:20 p.m., when I
restarted osd.25.

The question is whether it would be better to have the Ceph
orchestrator restart a laggy OSD daemon.

For example:

If OSD daemons from 3 hosts (currently it is 2 hosts by default)
report some OSD as dead for 20 s, could the MON give a command to
the mgr to restart this OSD? Something like STONITH in a
corosync/pcs cluster ( https://en.wikipedia.org/wiki/STONITH )?
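
The proposed rule can be sketched as a small decision function.
This is only an illustration of the idea, not something Ceph does
today: the FailureReport type, the should_restart function and
the host names in the example are hypothetical. The thresholds
presumably correspond to the existing options
mon_osd_min_down_reporters and osd_heartbeat_grace, which is an
assumption on my side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureReport:
    reporter_host: str   # host of the OSD that sent the failure report
    target_osd: int      # id of the OSD reported as unresponsive
    age_seconds: float   # how long the reporter has seen it unresponsive


def should_restart(reports, target_osd, min_hosts=3, grace_seconds=20.0):
    """Return True when enough distinct hosts have reported the OSD long enough.

    Reports from several OSDs on the same host count once, so a
    single misbehaving host cannot trigger a restart on its own.
    """
    hosts = {
        r.reporter_host
        for r in reports
        if r.target_osd == target_osd and r.age_seconds >= grace_seconds
    }
    return len(hosts) >= min_hosts


if __name__ == "__main__":
    # Hypothetical reports against osd.25 from three different hosts.
    reports = [
        FailureReport("host-a", 25, 31.0),
        FailureReport("host-a", 25, 28.0),  # same host, counted once
        FailureReport("host-b", 25, 22.5),
        FailureReport("host-c", 25, 20.0),
    ]
    print(should_restart(reports, target_osd=25))
```

Counting hosts rather than individual reporting OSDs mirrors the
"from 3 hosts" wording above; the restart itself would still be
a separate step.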

The logs are very big (around 23 MB as tar.bz2); I have put them here:

https://hazard.jcu.cz/osd.25.tar.bz2

We were thinking about a network problem, but there are another
11 OSDs on the same machine and they performed well.

This is not the first occurrence of this problem. When it happened
the first time, we tried to restart the whole server, but then we
found that restarting the OSD container is enough...

Sincerely
Jan Marek
-- 
Ing. Jan Marek
University of South Bohemia
Academic Computer Centre
Phone: +420389032080
http://www.gnu.org/philosophy/no-word-attachments.cs.html


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
