Quoting Jan Kasprzak (kas@xxxxxxxxxx):
> OK, many responses (thanks for them!) suggest chrony, so I tried it:
> With all three mons running chrony and being in sync with my NTP server
> with offsets under 0.0001 second, I rebooted one of the mons:
>
> There still was the HEALTH_WARN clock_skew message as soon as
> the rebooted mon started responding to ping. The cluster returned to
> HEALTH_OK about 95 seconds later.
>
> According to "ntpdate -q my.ntp.server", the initial offset
> after reboot is about 0.6 s (which is the reason for HEALTH_WARN, I think),
> but it gets under 0.0001 s in about 25 seconds. The remaining ~50 seconds
> of HEALTH_WARN is inside Ceph, with mons being already synchronized.
>
> So the result is that chrony indeed synchronizes faster,
> but nevertheless I still have about 95 seconds of HEALTH_WARN "clock skew
> detected".
>
> I guess the workaround for now is to ignore the warning, and wait
> for two minutes before rebooting another mon.

You can tune "mon_timecheck_skew_interval", which is set to 30 seconds
by default. See [1] and look for "timecheck" to find the related options.

Gr. Stefan

[1]: http://docs.ceph.com/docs/master/rados/configuration/mon-config-ref/

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
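
As an alternative to waiting a fixed two minutes between mon reboots, one could poll the cluster status and continue only once it reports HEALTH_OK. The following is a rough sketch, not something from this thread: it assumes the "ceph" CLI is on the PATH and that "ceph health" prints the status word first. The function names, the timeout and the poll interval are made up for illustration.

```python
import subprocess
import time


def ceph_health():
    """Return the first word printed by `ceph health`, e.g. HEALTH_OK.

    Assumption: the `ceph` CLI is installed and can reach the cluster.
    """
    out = subprocess.run(
        ["ceph", "health"], capture_output=True, text=True, check=True
    ).stdout
    parts = out.split()
    return parts[0] if parts else ""


def wait_for_health_ok(get_health=ceph_health, timeout=300.0, poll=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the cluster reports HEALTH_OK.

    Returns True once HEALTH_OK is seen, or False if `timeout` seconds
    elapse first. `get_health`, `sleep` and `clock` are injectable so the
    loop can be exercised without a real cluster.
    """
    deadline = clock() + timeout
    while True:
        if get_health() == "HEALTH_OK":
            return True
        if clock() >= deadline:
            return False
        sleep(poll)
```

Calling wait_for_health_ok() after each mon reboot, and proceeding to the next mon only when it returns True, would replace the fixed wait. It does not shorten the HEALTH_WARN period itself; that part still depends on the timecheck settings mentioned above.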