Re: HA and data recovery of CEPH

Peng Bo <pengbo@xxxxxxxxxxx> · Thu, 12 Dec 2019 11:29:45 +0800

Thanks to all. We have now got that duration to around 25 seconds, which is the best result we can achieve.
BR

On Tue, Dec 3, 2019 at 10:30 PM Wido den Hollander <wido@xxxxxxxx> wrote:

On 12/3/19 3:07 PM, Aleksey Gutikov wrote:

> 

>> That is true. When an OSD goes down it will take a few seconds for its

>> Placement Groups to re-peer with the other OSDs. During that period

>> writes to those PGs will stall for a couple of seconds.

>>

>> I wouldn't say it's 40s, but it can take ~10s.

> 

> Hello,

> 

> According to my experience, when an OSD crashes or is killed with -9 (any kind of 

> abnormal termination), OSD failure handling consists of the following steps:

> 1) Failed OSD's peers detect that it does not respond - it can take up 

> to osd_heartbeat_grace + osd_heartbeat_interval seconds

If a 'Connection Refused' is detected the OSD will be marked as down 

right away.

> 2) Peers send reports to monitor

> 3) Monitor makes a decision according to (options from its own config) 

> mon_osd_adjust_heartbeat_grace, osd_heartbeat_grace, 

> mon_osd_laggy_halflife, mon_osd_min_down_reporters, ... and finally marks the 

> OSD down in the osdmap.

True.

> 4) Monitor sends the updated OSDmap to OSDs and clients

> 5) OSDs start peering

> 5.1) Peering itself is a complicated process; for example, we have seen 

> PGs stuck in the inactive state due to 

> osd_max_pg_per_osd_hard_ratio.

I would say that 5.1 isn't relevant for most cases. Yes, it can happen, 

but it's rare.

> 6) Peering finishes (PG data continues moving) - clients can access the 

> affected PGs normally again. Clients also have their own timeouts that can 

> affect the time to recover.

> Again, according to my experience, 40s with default settings is possible.

> 

40s is possible in certain scenarios. But I wouldn't say that's the 

default for all cases.

Wido
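
For readers following the steps quoted above, here is a rough Python sketch of steps 1 and 3 only. It is a simplified model, not Ceph's actual code: the numeric values are assumed for illustration (check your own cluster's configuration), the helper names are hypothetical, and the monitor check deliberately ignores mon_osd_adjust_heartbeat_grace and mon_osd_laggy_halflife, which can lengthen the effective grace period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeartbeatSettings:
    # Values assumed for illustration; verify them against your own cluster.
    osd_heartbeat_interval: float = 6.0   # seconds between pings to peers
    osd_heartbeat_grace: float = 20.0     # seconds of silence before a peer reports
    mon_osd_min_down_reporters: int = 2   # distinct reporters the monitor requires


def worst_case_detection_seconds(s: HeartbeatSettings) -> float:
    """Step 1 upper bound: osd_heartbeat_grace + osd_heartbeat_interval."""
    return s.osd_heartbeat_grace + s.osd_heartbeat_interval


def monitor_would_mark_down(reporters: set, failed_for: float,
                            s: HeartbeatSettings) -> bool:
    """Simplified stand-in for step 3.

    Requires enough distinct reporters and a silence longer than the grace
    period. The laggy-OSD grace adjustment is intentionally left out.
    """
    enough_reporters = len(reporters) >= s.mon_osd_min_down_reporters
    silent_long_enough = failed_for >= s.osd_heartbeat_grace
    return enough_reporters and silent_long_enough


if __name__ == "__main__":
    settings = HeartbeatSettings()
    print("worst-case detection (s):", worst_case_detection_seconds(settings))
    print("mark down?", monitor_would_mark_down({3, 7}, 21.0, settings))
```

With the assumed values the detection window alone is bounded by 26 seconds, before the monitor's decision, OSDmap distribution and peering are added on top. When a 'Connection Refused' is seen, as noted above, this window does not apply.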
