Re: Requests blocked as cluster is unaware of dead OSDs for quite a long time

Wido den Hollander <wido@xxxxxxxx> · Tue, 27 Mar 2018 09:17:48 +0200

On 03/27/2018 12:58 AM, Jared H wrote:
> I have three datacenters with three storage hosts in each, which house
> one OSD/MON per host. There are three replicas, one in each datacenter.
> I want the cluster to be able to survive a nuke dropped on 1/3
> datacenters, scaling up to 2/5 datacenters. I do not need realtime data
> replication (Ceph is already fast enough), but I do need decently
> realtime fault tolerance such that requests are blocked for ideally less
> than 10 seconds.
> 
> In testing, I kill networking on 3 hosts and the cluster becomes
> unresponsive for 1-5 minutes as requests are blocked. The monitors are
> detected as down within 15-20 seconds, but OSDs take a long time to
> change state to 'down'.
> 
> I have played with these timeout and heartbeat options but they don't
> seem to have any effect:
> [osd]
> osd_heartbeat=3
> osd_heartbeat_grace=9
> osd_mon_heartbeat_interval=3
> osd_mon_report_interval_min=3
> osd_mon_report_interval_max=9
> osd_mon_ack_timeout=9
> 
> Is it the nature of the networking failure? I can pkill ceph-osd to
> simulate a software failure and they are detected as down almost instantly.
> 

When you kill the OSD process, the other OSDs get a 'connection refused'
and can declare the OSD down immediately. But when you kill the network,
nothing answers at all, so peers have to wait for things to time out.
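
To illustrate the difference, here is a minimal Python sketch. It is not
Ceph code and does not reflect how the OSD heartbeat is implemented; the
function name probe() and the timeout value are assumptions made only for
this example. It tries a TCP connect and reports whether the peer refused
the connection right away or whether the attempt ran into the timeout:

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Try a TCP connect and classify the outcome.

    Returns (status, elapsed_seconds), where status is one of
    'open', 'refused', 'timeout' or 'error'.
    """
    start = time.monotonic()
    try:
        # create_connection raises as soon as the kernel reports a result.
        with socket.create_connection((host, port), timeout=timeout):
            status = "open"
    except ConnectionRefusedError:
        # Host is reachable but nothing listens on the port: the peer's
        # kernel answers with a reset, so we learn about it immediately.
        status = "refused"
    except socket.timeout:
        # No answer at all (for example, packets silently dropped): we
        # only find out after the full timeout has elapsed.
        status = "timeout"
    except OSError:
        # Anything else, such as 'network unreachable'.
        status = "error"
    return status, time.monotonic() - start
```

Against a host where the daemon was killed but the network is up, probe()
should return 'refused' almost instantly. Against a host whose network was
cut, it typically returns 'timeout' only after the full timeout, which is
the same kind of delay you are seeing.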

It's hard to judge from the outside exactly what happens, but keep in
mind that Ceph is designed with data consistency as the number 1
priority. It will choose safety of data over availability. So if it's
not sure what is happening, I/O will block.

Wido

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com