Re: OSD's still UP after power loss

Is there any better solution?

Yes, add more nodes. ;-)
Having only two OSD nodes is not the best idea; it's kind of a corner case, and I've observed some weird behaviour with corner cases in the past, not to mention the 2 replicas. Is this a test environment?


Quoting by morphin <morphinwithyou@xxxxxxxxx>:

I've figured it out, but I'm worried about the result.
The solution is "mon_osd_min_down_reporters = 1".
Because this is a "two node" cluster with "replicated 2" and "chooseleaf host",
the reporter count has to be set to 1, but in case of a malfunction this could
be a serious problem.
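
For reference, a minimal sketch of how that setting could be applied and verified; mon.ID is just a placeholder for the monitor name, and putting the option in ceph.conf and restarting the mons would be the alternative:

  # Sketch only: set the reporter threshold in the centralized config store
  # (available on Nautilus), then read back what the running mon actually uses.
  ceph config set mon mon_osd_min_down_reporters 1
  ceph daemon mon.ID config get mon_osd_min_down_reporters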

Is there any better solution?

On Thu, 20 May 2021 at 22:04, by morphin <morphinwithyou@xxxxxxxxx>
wrote:

Hello

I have a weird problem on a 3-node cluster ("Nautilus 14.2.9").
When I test a power failure, the OSDs are not marked as DOWN and the MDS
does not respond anymore.
If I manually set the OSDs down, the MDS becomes active again.
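
For context, the manual workaround amounts to something like the following (the OSD IDs below are placeholders for the OSDs on the failed host):

  # Mark the affected OSDs down by hand so peering and the MDS can recover.
  ceph osd down 0
  ceph osd down 1
  # Verify their state afterwards.
  ceph osd tree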

BTW: Only 2 nodes have OSDs. The third node is only for the MON.

I've set mon_osd_down_out_interval = 0.3 in the [global] section of
ceph.conf and restarted all MONs, but when I check with "ceph daemon
mon.ID config show" I still see mon_osd_down_out_interval: "600". I don't
get why it is still "600", and honestly I don't even know whether it has
any effect on my problem.
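
As a sketch of how this could be narrowed down (the option is in seconds, default 600; 30 below is just an arbitrary example value and mon.ID a placeholder):

  # Set the option in the centralized config store instead of ceph.conf...
  ceph config set mon mon_osd_down_out_interval 30
  # ...and read back what the running monitor actually uses.
  ceph daemon mon.ID config get mon_osd_down_out_interval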

Where should I check?


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



