Re: Date drift and ntpd




Jason Pyeron wrote, On 08/12/2010 08:01 AM:
>  
> 
>> -----Original Message-----
>> From: centos-bounces@xxxxxxxxxx 
>> [mailto:centos-bounces@xxxxxxxxxx] On Behalf Of Simon Billis
>> Sent: Thursday, August 12, 2010 7:36
>> To: 'CentOS mailing list'
>> Subject: Re:  Date drift and ntpd
>>
>> Jason Pyeron sent a missive on 2010-08-12:
>>
>>> We have a local time server and all of our machines are pointed at it
>>> for the time.
>>>
>>> How can the clock drift by a day and a half?
>>>
>>> [root@devserver21 ~]# date
>>> Fri Aug 13 14:43:29 EDT 2010
>>> [root@devserver21 ~]# rdate -s 192.168.1.67
>>> [root@devserver21 ~]# date
>>> Thu Aug 12 07:02:39 EDT 2010
>>> [root@devserver21 ~]# cat /etc/ntp.conf | grep -v ^# | grep -v ^$ 
>>> restrict default nomodify notrap noquery
>>> restrict 127.0.0.1
>>> server 192.168.1.67
>>> server 192.168.1.66
>>> server 192.168.1.65
>>> server  127.127.1.0     # local clock
>>> fudge   127.127.1.0 stratum 10
>>> driftfile /var/lib/ntp/drift
>>> broadcastdelay  0.008
>>> keys            /etc/ntp/keys
>>>
>>>
>> Hi,
>>
>> It is unlikely that the machine in question drifted forward 
>> in time if ntpd was running. Have a look at the logs 
>> /var/log/messages it should contain the ntpd log messages 
> 
> [root@devserver21 ~]# grep ntpd /var/log/messages
> </snip>
> Jul 28 20:34:41 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
> Jul 28 21:08:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
> Jul 28 21:08:00 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM
> Jul 28 21:08:11 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
> Jul 28 21:24:58 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
> Jul 28 21:41:26 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
> Jul 28 21:42:16 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
> Jul 28 21:42:16 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM
> Jul 28 21:42:34 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM
> Jul 28 21:43:37 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM

> tolerance 500 PPM
> Jul 28 22:12:07 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM
> Jul 28 22:13:13 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM
> Jul 28 22:14:17 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM
> Jul 28 22:15:11 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
> Jul 28 22:31:41 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
> Jul 28 22:31:41 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
> tolerance 500 PPM

> Jul 29 15:14:01 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
> Jul 29 15:26:05 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
> Jul 29 15:59:17 devserver21 ntpd[3475]: time reset -1.599691 s
> Jul 29 16:03:31 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
> Jul 29 16:05:38 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
> Jul 29 16:08:46 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
> Jul 29 16:11:55 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3

> Jul 29 17:23:57 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
> Jul 29 17:24:59 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
> Jul 29 17:30:46 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
> Jul 29 17:47:24 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
> Aug 12 22:48:29 devserver21 ntpd[3475]: sendto(192.168.1.66): Operation not
> permitted
> [root@devserver21 ~]# uptime
>  08:10:19 up 164 days,  9:56,  2 users,  load average: 0.20, 0.54, 0.81
> [root@devserver21 ~]#

Assumption: this is not from any kind of virtual machine.
Assumption: Your local time server is NOT a GPS with an ovenized crystal or even a cell phone time
source, i.e. NOT very stable.
Assumption: the time servers that you are following (192.168.1.6[5-7]) are:
	a) each following the same timeserver(s), or at least have one in common.
	b) peering with one another
	c) following time servers that are reasonably stable.
Assumption: the time farm is on real, non-busy hardware (an old cisco router serving as the internet
connection to 1000+ computers does not qualify as non-busy) and is configured to achieve
maxpoll 10 or higher.

One problem that you have is that your timeserver farm (192.168.1.6[5-7]) is occasionally losing its
servers, i.e. we see "synchronized to LOCAL(0)" every so often. With a well configured time farm
that should only happen after hours to days without a source, not minutes.

The second problem is that a machine which is not intended to be a time server (devserver21) is
configured with a local clock at a stratum better than 15 (your ntp.conf fudges it to stratum 10).

suggestion 1: 65 should have its local clock at stratum 13, 66 and 67 should have their local clocks
at stratum 14 or 15, and all other machines should either have no local clock or have one with a
stratum no better than 15. Yes, after reading the ntp documentation, I disagree with RedHat's default.
The net result should be that you don't get any local clock loops in the setup, because you have a
defined leader, and if even the defined leader is lost the other machines should drift stably.
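
A rough sketch of what suggestion 1 could look like in ntp.conf. Only the local clock lines are
shown; the upstream servers that 65, 66 and 67 follow are not given in this thread, so they are left
out, and picking 14 (rather than 15) for 66 and 67 is just one of the two options above.

```
# /etc/ntp.conf on 192.168.1.65 (the defined leader)
server  127.127.1.0             # local clock, fallback only
fudge   127.127.1.0 stratum 13

# /etc/ntp.conf on 192.168.1.66 and 192.168.1.67
server  127.127.1.0             # local clock, fallback only
fudge   127.127.1.0 stratum 14

# /etc/ntp.conf on every other machine (e.g. devserver21):
# comment out the local clock entirely
#server  127.127.1.0
#fudge   127.127.1.0 stratum 10
```

ntpd needs a restart on each machine to pick up the change.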

suggestion 2: 65, 66 & 67 should ALL peer with one another for added stability in the time farm.
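
For suggestion 2, a minimal sketch of the peer lines, assuming the three servers can reach each other
directly and that their restrict lines allow it. Each server lists the other two:

```
# /etc/ntp.conf on 192.168.1.65
peer 192.168.1.66
peer 192.168.1.67

# /etc/ntp.conf on 192.168.1.66
peer 192.168.1.65
peer 192.168.1.67

# /etc/ntp.conf on 192.168.1.67
peer 192.168.1.65
peer 192.168.1.66
```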

suggestion 3: client machines should 'prefer' one of your servers over the others.
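
For suggestion 3, something along these lines on the clients. Which server gets 'prefer' is an
assumption here; 65 is used only because it is the defined leader in suggestion 1.

```
# /etc/ntp.conf on a client such as devserver21
server 192.168.1.65 prefer
server 192.168.1.66
server 192.168.1.67
```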

suggestion 4: see if someone has been messing with the kernel ticks on the machine...
run `tickadj` (see file:///usr/share/doc/ntp-4.2.2p1/tickadj.html).
I had one computer where I needed to tweak the default value up or down by one (I don't remember
which) to make it really stable. This should be a last resort.


-- 
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter