Re: OSD repeatedly marked down

	Sebastian,

Sebastian Knust wrote:
: On 01.12.21 17:31, Jan Kasprzak wrote:
: >In "ceph -s", the "2 osds down"
: >message disappears, and the number of degraded objects steadily decreases.
: >However, after some time the number of degraded objects starts going up
: >and down again, and OSDs appear to be down (and then up again). After 5 minutes
: >the OSDs are kicked out of the cluster, and the ceph-osd daemons stop:
: >Dec 01 17:18:07 my.osd.host ceph-osd[3818]: 2021-12-01T17:18:07.626+0100 7f8c38e02700 -1 received  signal: Interrupt from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
: >Dec 01 17:18:07 my.osd.host ceph-osd[3818]: 2021-12-01T17:18:07.626+0100 7f8c38e02700 -1 osd.32 1119559 *** Got signal Interrupt ***
: >Dec 01 17:18:07 my.osd.host ceph-osd[3818]: 2021-12-01T17:18:07.626+0100 7f8c38e02700 -1 osd.32 1119559 *** Immediate shutdown (osd_fast_shutdown=true) ***
: >
: 
: Do you have enough memory on your host? You might want to look for
: oom messages in dmesg / journal and monitor your memory usage
: throughout the recovery.

	Yes, I have lots of memory. This particular node has 512 GB,
and according to top(1), the ceph-osd daemon has VSZ around 1.1 GB.
An OOM kill would be visible in dmesg(8), and there is none. AFAIK, CentOS 8
Stream does not ship systemd-oomd(8) yet.
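
	For completeness, this is roughly how I checked (a sketch; the
grep patterns cover the usual kernel OOM-killer phrasings, pgrep -o just
picks the oldest ceph-osd, and VmSwap is the per-process swap counter in
/proc):

```shell
# Any OOM-killer activity since boot? (covers the usual kernel phrasings)
dmesg -T | grep -Ei 'out of memory|oom-kill|killed process'

# Swap devices and overall usage:
swapon --show
free -h

# Per-process swap usage of one ceph-osd (VmSwap from /proc/PID/status):
grep VmSwap /proc/$(pgrep -o ceph-osd)/status
```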

: If the osd processes are indeed killed by OOM killer, you have a few
: options. Adding more memory would probably be best to future-proof
: the system. Maybe you could also work with some Ceph config setting,
: e.g. lowering osd_max_backfills (although I'm definitely not an
: expert on which parameters would give you the best result). Adding
: swap will most likely only produce other issues, but might be a
: method of last resort.
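
	(For the archives: if throttling backfill does turn out to help
someone, the knob can be changed at runtime via the config database on
reasonably recent releases — a sketch, not something I have tried here:)

```shell
# Show the current value:
ceph config get osd osd_max_backfills

# Throttle recovery/backfill cluster-wide; takes effect without restarts:
ceph config set osd osd_max_backfills 1
```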

	I tend to add a small swap partition to my systems (this one
has 8 GB of swap), just so that rarely-used initialization code can be
paged out of long-running processes. But after the ceph-osd daemons start
(and get killed after exactly 600.0 seconds), exactly zero bytes of swap
are in use.
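
	The 600.0 s is suspiciously exact, so I also looked at which
timeouts default to 600 seconds (a sketch; whether any of them is
actually related to the SIGINT is pure speculation on my part —
mon_osd_down_out_interval IIRC defaults to 600 s):

```shell
# The 10-minute down-to-out timer:
ceph config get mon mon_osd_down_out_interval

# Grep the running OSD's config for other 600 s values (on the OSD host):
ceph daemon osd.32 config show | grep -w 600
```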

	So I don't think my problem is OOM. It might be a communication
problem, but I ran tcpdump looking, for example, for ICMP port-unreachable
messages, and found nothing interesting there.
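
	In case it is useful to anyone, the captures were along these
lines (a sketch; the interface name and the 6800-7300 OSD port range are
assumptions about my setup):

```shell
# ICMP errors such as port/host unreachable (ICMP type 3):
tcpdump -ni eth0 'icmp[icmptype] == icmp-unreach'

# TCP resets on the OSD messenger/heartbeat port range:
tcpdump -ni eth0 'tcp[tcpflags] & tcp-rst != 0 and portrange 6800-7300'
```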

-Yenya

-- 
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| http://www.fi.muni.cz/~kas/                         GPG: 4096R/A45477D5 |
    We all agree on the necessity of compromise. We just can't agree on
    when it's necessary to compromise.                     --Larry Wall
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


