Konstantin Shalygin wrote:
: >how do you deal with the "clock skew detected" HEALTH_WARN message?
: >
: >I think the internal RTC in most x86 servers does have 1 second resolution
: >only, but Ceph skew limit is much smaller than that. So every time I reboot
: >one of my mons (for kernel upgrade or something), I have to wait for several
: >minutes for the system clock to synchronize over NTP, even though ntpd
: >has been running before reboot and was started during the system boot again.
:
: Definitely you should use chrony with iburst.

OK, many responses (thanks for them!) suggest chrony, so I tried it:

With all three mons running chrony and in sync with my NTP server
with offsets under 0.0001 second, I rebooted one of the mons:

The HEALTH_WARN clock skew message still appeared as soon as
the rebooted mon started responding to ping. The cluster returned to
HEALTH_OK about 95 seconds later.

According to "ntpdate -q my.ntp.server", the initial offset
after reboot was about 0.6 s (which is the reason for the HEALTH_WARN,
I think), but it dropped below 0.0001 s in about 25 seconds. The remaining
~50 seconds of HEALTH_WARN are spent inside Ceph, with the mons already
synchronized.

So the result is that chrony does synchronize faster, but
I still get about 95 seconds of HEALTH_WARN "clock skew detected".

I guess the workaround for now is to ignore the warning and wait
two minutes before rebooting another mon.

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| http://www.fi.muni.cz/~kas/                         GPG: 4096R/A45477D5 |
 sir_clive> I hope you don't mind if I steal some of your ideas?
 laryross> As far as stealing... we call it sharing here.   --from rcgroups
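
The workaround above (wait for the warning to clear before rebooting the next mon) could be scripted instead of timed by hand. What follows is only a minimal sketch, assuming the `ceph` CLI is on the PATH and that the first word of `ceph health` output is the status (HEALTH_OK, HEALTH_WARN, ...). The function names, the 300 second timeout and the 5 second polling interval are made up for illustration and are not taken from Ceph.

```python
import subprocess
import time


def ceph_health():
    """Return the first word of `ceph health` output, e.g. "HEALTH_OK"."""
    out = subprocess.run(
        ["ceph", "health"], capture_output=True, text=True, check=True
    ).stdout
    words = out.split()
    return words[0] if words else ""


def wait_for_health_ok(get_health=ceph_health, timeout=300.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the cluster reports HEALTH_OK or `timeout` seconds pass.

    Returns True once HEALTH_OK is seen, False if the deadline is reached.
    `get_health`, `sleep` and `clock` are injectable so the loop can be
    exercised without a running cluster.
    """
    deadline = clock() + timeout
    while True:
        if get_health() == "HEALTH_OK":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_for_health_ok()` after a mon comes back up, and rebooting the next one only when it returns True, replaces the fixed two-minute wait with a check of the actual cluster state. Note that this waits for any HEALTH_WARN to clear, not just the clock skew one.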