Re: How do you deal with "clock skew detected"?

Uwe Sauter <uwe.sauter.de@xxxxxxxxx> · Thu, 16 May 2019 19:05:42 +0200

You could also edit your ceph-mon@.service unit (assuming systemd) so that it depends on chrony, and add the line
"ExecStartPre=/usr/bin/sleep 30" to delay startup and give chrony a chance to sync before the mon is started.
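
A minimal sketch of such a drop-in, assuming the time daemon's unit is named chronyd.service (it is chrony.service on Debian/Ubuntu) and using a hypothetical file name under the unit's drop-in directory:

```ini
# Hypothetical drop-in: /etc/systemd/system/ceph-mon@.service.d/wait-for-chrony.conf
[Unit]
# Start the mon only after the time daemon has been started.
# The unit name is an assumption; use chrony.service on Debian/Ubuntu.
Wants=chronyd.service
After=chronyd.service

[Service]
# Fixed delay so chrony can correct the clock before the mon starts.
ExecStartPre=/usr/bin/sleep 30
```

Run "systemctl daemon-reload" afterwards so systemd picks up the drop-in. Note that After= only waits for chrony to be started, not to be synchronized, which is why the sleep is there. The 30 seconds are a guess rather than a guarantee; Jan's numbers below suggest chrony needs about 25 seconds after a reboot.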

On 16.05.19 at 17:38, Stefan Kooman wrote:
Quoting Jan Kasprzak (kas@xxxxxxxxxx):

	OK, many responses (thanks for them!) suggest chrony, so I tried it.
With all three mons running chrony and in sync with my NTP server
(offsets under 0.0001 second), I rebooted one of the mons:

	The HEALTH_WARN clock_skew message still appeared as soon as
the rebooted mon started responding to ping. The cluster returned to
HEALTH_OK about 95 seconds later.

	According to "ntpdate -q my.ntp.server", the initial offset
after reboot is about 0.6 s (which is the reason for the HEALTH_WARN, I think),
but it drops under 0.0001 s in about 25 seconds. The remaining ~50 seconds
of HEALTH_WARN are spent inside Ceph, with the mons already synchronized.

	So the result is that chrony does synchronize faster,
but I still get about 95 seconds of HEALTH_WARN "clock skew
detected".

	I guess the workaround for now is to ignore the warning and wait
two minutes before rebooting another mon.

You can tune "mon_timecheck_skew_interval", which defaults to
30 seconds. See [1] and search for "timecheck" to find the related
options.
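
As a sketch, a ceph.conf fragment lowering that interval could look like the following. The value 10 is only an illustrative assumption, not a recommendation; the drift option is listed with its default purely for reference.

```ini
[mon]
# Seconds between time checks while a skew is present (default: 30).
# 10 is an illustrative value, not a tested recommendation.
mon_timecheck_skew_interval = 10
# Clock drift allowed between mons, in seconds (default: 0.05), left unchanged.
mon_clock_drift_allowed = 0.05
```

With a 0.05 s allowance, the 0.6 s offset seen right after reboot is enough to trigger the warning; a shorter skew interval should only make the mons notice sooner that the clocks agree again.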

Gr. Stefan

[1]: http://docs.ceph.com/docs/master/rados/configuration/mon-config-ref/
