Re: How do you deal with "clock skew detected"?

Stefan Kooman <stefan@xxxxxx> · Thu, 16 May 2019 17:38:04 +0200

Quoting Jan Kasprzak (kas@xxxxxxxxxx):

> 	OK, many responses (thanks for them!) suggest chrony, so I tried it:
> With all three mons running chrony and being in sync with my NTP server
> with offsets under 0.0001 second, I rebooted one of the mons:
> 
> 	The HEALTH_WARN clock_skew message still appeared as soon as
> the rebooted mon started responding to ping. The cluster returned to
> HEALTH_OK about 95 seconds later.
> 
> 	According to "ntpdate -q my.ntp.server", the initial offset
> after reboot is about 0.6 s (which is the reason for HEALTH_WARN, I think),
> but it gets under 0.0001 s in about 25 seconds. The remaining ~70 seconds
> of HEALTH_WARN is inside Ceph, with mons being already synchronized.
> 
> 	So the result is that chrony indeed synchronizes faster,
> but nevertheless I still have about 95 seconds of HEALTH_WARN "clock skew
> detected".
> 
> 	I guess the workaround for now is to ignore the warning, and wait
> for two minutes before rebooting another mon.

You can tune "mon_timecheck_skew_interval", which is set to 30 seconds
by default. See [1] and search for "timecheck" to find the related
options.
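
As a rough sketch of what that could look like in ceph.conf (the value
of 10 seconds is only an assumed example, not a recommendation, and I
have not tested how much it shortens the warning on your cluster):

```
[mon]
# Assumed example value; the default is 30 seconds.
# Interval between time checks while a skew is present.
mon_timecheck_skew_interval = 10
```

The mons need to pick up the new value (for example via a restart)
before it takes effect.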

Gr. Stefan

[1]:
http://docs.ceph.com/docs/master/rados/configuration/mon-config-ref/

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx