Hello,
Before we start: I'm fully aware that this kind of setup is not
recommended by any means, and I'm familiar with its implications. I'm
just trying to practice extreme situations, just in case...
I have a test cluster with:
3 nodes with Proxmox 7.3 + Ceph Quincy 17.2.5
3 monitors + 3 managers in server01, server02 and server03
4 OSDs: two in server01, two in server02, none in server03. All OSDs
have device class "ssd".
1 pool with size=2, min_size=1; its CRUSH rule uses only OSDs of class "ssd".
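Roughly the CLI equivalent of this setup, with placeholder pool/rule
names and PG count:

  # CRUSH rule limited to the "ssd" device class, failure domain = host
  ceph osd crush rule create-replicated replicated_ssd default host ssd

  # pool using that rule ("testpool" and the PG count are just examples)
  ceph osd pool create testpool 32 32 replicated
  ceph osd pool set testpool crush_rule replicated_ssd
  ceph osd pool set testpool size 2
  ceph osd pool set testpool min_size 1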
I wait for ceph status to be fully HEALTH_OK between tests.
A.- If I shut down server01 cleanly, its OSDs get marked down as
expected. I/O on the pool works correctly before, during and after the
shutdown.
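(In case it matters, the I/O tests are nothing fancy, just something
along these lines; the pool name is the placeholder from above:)

  rados bench -p testpool 10 write --no-cleanup
  rados bench -p testpool 10 rand
  rados -p testpool cleanup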
B.- If I power off server01 abruptly (hard power cut), its OSDs do not
get marked down. I/O on the pool does not work at all, neither reads nor
writes. A small number of slow ops shows up in ceph status, somewhere
between 7 and 25. After about 30 minutes, server01's OSDs finally get
marked down, I/O on the pool is restored and the slow ops disappear.
C.- Now I create an OSD on server03 with device class "noClass". This
OSD is not used by the pool. If I now power off server01 abruptly, its
OSDs get marked down as soon as some I/O is sent to the pool, and I/O
keeps working correctly.
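(In case it matters, the class on that new OSD was set along these
lines; osd.4 is just an example ID:)

  # replace the automatically assigned device class with "noClass"
  ceph osd crush rm-device-class osd.4
  ceph osd crush set-device-class noClass osd.4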
Looks like I am in this exact situation:
https://tracker.ceph.com/issues/16910#note-2
Questions:
Why does Ceph behave this way in test B? Shouldn't it simply mark the
OSDs down, as in tests A and C?
Which config setting(s) control that 30-minute wait before all of
server01's OSDs get marked down?
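For what it's worth, these are the options I've been looking at so far,
without being sure they are the relevant ones:

  ceph config get mon mon_osd_report_timeout
  ceph config get mon mon_osd_min_down_reporters
  ceph config get mon mon_osd_reporter_subtree_level
  ceph config get osd osd_heartbeat_grace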
Many thanks in advance!