You could also edit your ceph-mon@.service (assuming systemd) to depend on chrony and add a line
"ExecStartPre=/usr/bin/sleep 30" to stall the startup to give chrony a chance to sync before the Mon is started.
Am 16.05.19 um 17:38 schrieb Stefan Kooman:
Quoting Jan Kasprzak (kas@xxxxxxxxxx):
OK, many responses (thanks for them!) suggest chrony, so I tried it:
With all three mons running chrony and being in sync with my NTP server
with offsets under 0.0001 second, I rebooted one of the mons:
There still was the HEALTH_WARN clock_skew message as soon as
the rebooted mon starts responding to ping. The cluster returns to
HEALTH_OK about 95 seconds later.
According to "ntpdate -q my.ntp.server", the initial offset
after reboot is about 0.6 s (which is the reason of HEALTH_WARN, I think),
but it gets under 0.0001 s in about 25 seconds. The remaining ~50 seconds
of HEALTH_WARN is inside Ceph, with mons being already synchronized.
So the result is that chrony indeed synchronizes faster,
but nevertheless I still have about 95 seconds of HEALTH_WARN "clock skew
detected".
I guess now the workaround now is to ignore the warning, and wait
for two minutes before rebooting another mon.
You can tune the "mon_timecheck_skew_interval" which by default is set
to 30 seconds. See [1] and look for "timecheck" to find the different
options.
Gr. Stefan
[1]:
http://docs.ceph.com/docs/master/rados/configuration/mon-config-ref/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com