Sorry for the top post, sent from a BlackBerry. Clock skew is a known issue and the recommendation is to run 32-bit. I run ntpdate every 30 minutes to fix the skewing, disable the ntp daemon, and DO NOT use the VMware Tools... DO NOT...

On 11/8/08, Le Wen <wenle@xxxxxxxxxx> wrote:
> Hi Kenneth
>
> Try adding
>
>   clock=pmtmr notsc
>
> to your kernel parameters; it works for me.
>
>
> From: karthikeyan <karthik_arnold1@xxxxxxxxx>
> Sent by: redhat-list-bounces@xxxxxxxxxx
> Date: 2008-11-08 14:40
> Please respond to: karthik_arnold1@xxxxxxxxx; General Red Hat Linux discussion list <redhat-list@xxxxxxxxxx>
> To: redhat-list@xxxxxxxxxx
> Subject: NTP problem for virtual RHEL 4 server on VmWare (Kenneth Holter)
>
> Hi Kenneth
>
> This is a known issue with RHEL on VMware. You can find the knowledge base
> article about time running slow on the VMware website.
>
> Please find the link:
>
> http://www.djax.co.uk/kb/linux/vmware_clock_drift.html
>
> Regards,
> Karthik
>
> --- On Fri, 11/7/08, redhat-list-request@xxxxxxxxxx
> <redhat-list-request@xxxxxxxxxx> wrote:
>
>> From: redhat-list-request@xxxxxxxxxx <redhat-list-request@xxxxxxxxxx>
>> Subject: redhat-list Digest, Vol 57, Issue 7
>> To: redhat-list@xxxxxxxxxx
>> Date: Friday, November 7, 2008, 10:30 PM
>>
>> Send redhat-list mailing list submissions to
>>     redhat-list@xxxxxxxxxx
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>     https://www.redhat.com/mailman/listinfo/redhat-list
>> or, via email, send a message with subject or body 'help' to
>>     redhat-list-request@xxxxxxxxxx
>>
>> You can reach the person managing the list at
>>     redhat-list-owner@xxxxxxxxxx
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of redhat-list digest..."
>>
>>
>> Today's Topics:
>>
>>    1. RE: Cluster Heart Beat Using Cross Over Cable
>>       (Karchner, Craig (IT Solutions US))
>>    3. NTP problem for virtual RHEL 4 server on VmWare (Kenneth Holter)
>>    4. Cluster Broken pipe & node Reboot (lingu)
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Thu, 6 Nov 2008 09:24:12 -0800
>> From: "Karchner, Craig (IT Solutions US)" <craig.a.karchner@xxxxxxxxxxx>
>> Subject: RE: Cluster Heart Beat Using Cross Over Cable
>> To: "General Red Hat Linux discussion list" <redhat-list@xxxxxxxxxx>
>> Message-ID: <13FE6613E1ADA041A0124537010C11E903742020@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
>> Content-Type: text/plain; charset="us-ascii"
>>
>> Lingu,
>>
>> I had this same problem a few weeks back. This is how I solved it.
>>
>> Make sure your NICs are running at 1G.
>>
>> Add the following entries to your cluster.ccs file and write it to disk:
>>
>>     heartbeat_rate = 30
>>     allowed_misses = 4
>>
>> My cluster.ccs file now looks like this:
>>
>> cluster {
>>     name = "alpha"
>>     lock_gulm {
>>         servers = ["server1", "server2", "server3"]
>>         heartbeat_rate = 30
>>         allowed_misses = 4
>>     }
>> }
>>
>> This example procedure shows how to change configuration files in a CCS
>> archive.
>>
>> 1. Extract the configuration files from the CCA device into the temporary
>>    directory /root/alpha-new/:
>>
>>    ccs_tool extract /dev/pool/alpha_cca /root/alpha-new/
>>
>> 2. Make the changes to the configuration files in /root/alpha-new/.
>>
>> 3. Create a new CCS archive on the CCA device by using the -O (override)
>>    flag to forcibly overwrite the existing CCS archive:
>>
>>    ccs_tool -O create /root/alpha-new/ /dev/pool/alpha_cca
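A note on the heartbeat_rate and allowed_misses values quoted above (an editorial aside, not part of the quoted message): if heartbeat_rate is in seconds, as the GFS 6.0 lock_gulm documentation describes it, these settings let a lock server miss roughly heartbeat_rate x allowed_misses = 30 s x 4 = 120 seconds of heartbeats before a node is expired, which is what makes the cluster tolerant of short bursts of network congestion.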
>>
>> What you are suggesting (a crossover cable) is not supported, at least in
>> GFS 6.0, which I assume you are running with RHEL 3.0.
>>
>>
>> -----Original Message-----
>> From: redhat-list-bounces@xxxxxxxxxx
>> [mailto:redhat-list-bounces@xxxxxxxxxx] On Behalf Of lingu
>> Sent: Thursday, November 06, 2008 7:41 AM
>> To: General Red Hat Linux discussion list
>> Subject: Cluster Heart Beat Using Cross Over Cable
>>
>> Hi,
>>
>> I am running a two-node active/passive cluster on RHEL 3 Update 8 (64-bit)
>> on HP boxes, with external HP storage connected via SCSI. The cluster ran
>> fine for the last three years, but all of a sudden the cluster service
>> keeps shifting from one node to the other (at least once a day).
>>
>> After analysing the syslog I found that the service was being shifted
>> because of some network fluctuation. Both nodes have two NICs bonded
>> together and configured with the IPs below.
>>
>> My network details:
>>
>> 192.168.1.2 -- node 1 physical IP with class C subnet (bond0)
>> 192.168.1.3 -- node 2 physical IP with class C subnet (bond0)
>> 192.168.1.4 -- floating IP (cluster)
>>
>> Since this is a very critical and busy server, heavy network load may be
>> causing some heartbeat signals to be missed, resulting in the service
>> shifting from one node to the other.
>>
>> So I plan to connect a crossover cable for the heartbeat messages. Can
>> anyone guide me, or point me to a link that explains, how to do this and
>> what changes I have to make in the cluster configuration file after
>> connecting the crossover cable?
>>
>> Regards,
>>
>> Lingu
>>
>> --
>> redhat-list mailing list
>> unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
>> https://www.redhat.com/mailman/listinfo/redhat-list
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Fri, 7 Nov 2008 10:49:04 +0100
>> From: "Kenneth Holter" <kenneho.ndu@xxxxxxxxx>
>> Subject: NTP problem for virtual RHEL 4 server on VmWare
>> To: redhat-list@xxxxxxxxxx
>> Message-ID: <c25f25140811070149u2d098492rf2c36e6b07941225@xxxxxxxxxxxxxx>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Hi.
>>
>> One of our RHEL 4 servers running on VMware has a quite serious NTP
>> problem. I know that NTP can be an issue when running Red Hat boxes on
>> VMware, so as a fix I put this small script in a file in /etc/cron.hourly:
>>
>> [root@server cron.hourly]# cat ntpdate
>> #!/bin/sh
>> /etc/init.d/ntpd stop
>> ntpdate 1.2.3.4 >> /tmp/time_adjust.log
>> /etc/init.d/ntpd start
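For comparison, the every-30-minutes variant mentioned in the top post can be expressed as a single /etc/crontab entry instead of an hourly script; a minimal sketch, where the time server address and log path are placeholders and not taken from any of the original messages:

    */30 * * * * root /usr/sbin/ntpdate -u 1.2.3.4 >> /var/log/ntpdate.log 2>&1

No ntpd stop/start is needed in that variant because the ntp daemon is left disabled, as the top post recommends; the -u flag makes ntpdate send from an unprivileged source port, which helps when a firewall blocks incoming traffic to port 123.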
>>
>> After investigating the "/tmp/time_adjust.log" file, I was quite surprised
>> by the amount of drift found on one particular server. Consider this
>> extract from the file:
>>
>>  6 Nov 20:00:01 ntpdate[19373]: step time server 1.2.3.4 offset -60.504153 sec
>>  6 Nov 20:00:52 ntpdate[19666]: step time server 1.2.3.4 offset -8.735440 sec
>>  6 Nov 20:01:00 ntpdate[19689]: step time server 1.2.3.4 offset -1.635632 sec
>>  6 Nov 20:54:06 ntpdate[24198]: step time server 1.2.3.4 offset -415.894712 sec
>>  6 Nov 21:01:01 ntpdate[24920]: adjust time server 1.2.3.4 offset 0.136833 sec
>>  6 Nov 22:01:02 ntpdate[29943]: adjust time server 1.2.3.4 offset -0.114253 sec
>>  6 Nov 23:01:01 ntpdate[2519]: adjust time server 1.2.3.4 offset -0.036345 sec
>>  7 Nov 00:01:00 ntpdate[7577]: step time server 1.2.3.4 offset -1.064935 sec
>>  7 Nov 01:00:57 ntpdate[12697]: step time server 1.2.3.4 offset -3.922577 sec
>>  7 Nov 02:00:21 ntpdate[17733]: step time server 1.2.3.4 offset -40.421825 sec
>>  7 Nov 02:01:00 ntpdate[17777]: step time server 1.2.3.4 offset -1.123175 sec
>>  7 Nov 02:57:23 ntpdate[22542]: step time server 1.2.3.4 offset -218.649820 sec
>>  7 Nov 03:00:36 ntpdate[22900]: step time server 1.2.3.4 offset -25.284528 sec
>>  7 Nov 03:00:58 ntpdate[22940]: step time server 1.2.3.4 offset -3.104130 sec
>>  7 Nov 03:52:32 ntpdate[27430]: step time server 1.2.3.4 offset -509.363952 sec
>>  7 Nov 03:59:50 ntpdate[27943]: step time server 1.2.3.4 offset -71.430354 sec
>>  7 Nov 04:00:52 ntpdate[28236]: step time server 1.2.3.4 offset -9.344907 sec
>>  7 Nov 04:01:00 ntpdate[28259]: step time server 1.2.3.4 offset -1.237651 sec
>>  7 Nov 05:01:01 ntpdate[1363]: adjust time server 1.2.3.4 offset 0.390149 sec
>>  7 Nov 06:01:01 ntpdate[6419]: adjust time server 1.2.3.4 offset -0.185112 sec
>>  7 Nov 07:01:02 ntpdate[11493]: adjust time server 1.2.3.4 offset -0.228884 sec
>>  7 Nov 08:00:59 ntpdate[16579]: step time server 1.2.3.4 offset -2.166519 sec
>>  7 Nov 09:00:38 ntpdate[21522]: step time server 1.2.3.4 offset -23.169420 sec
>>  7 Nov 09:01:02 ntpdate[21558]: adjust time server 1.2.3.4 offset -0.492106 sec
>>  7 Nov 09:59:26 ntpdate[26329]: step time server 1.2.3.4 offset -95.154264 sec
>>  7 Nov 10:00:55 ntpdate[26639]: step time server 1.2.3.4 offset -5.997955 sec
>>  7 Nov 10:01:01 ntpdate[26658]: step time server 1.2.3.4 offset -0.506367 sec
>>
>> Does anyone know what may be causing the RHEL box to drift as much as 500
>> seconds in only one hour?
>>
>> Regards,
>> Kenneth Holter
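To put those numbers in perspective: between the step at 03:00:58 and the next run at 03:52:32 the guest accumulated about 509 seconds of error over roughly 52 minutes (about 3100 seconds), so the virtual machine's clock was running on the order of 16% slow, far beyond the 500 ppm that ntpd can correct by slewing. This is exactly the situation the clock=pmtmr notsc suggestion at the top of this thread is aimed at; a minimal sketch of where those options go in /boot/grub/grub.conf, where the kernel version and root device are placeholders and not taken from the original messages:

    # illustrative grub.conf stanza for a RHEL 4 guest on VMware
    title Red Hat Enterprise Linux AS (2.6.9-78.EL)
            root (hd0,0)
            # clock=pmtmr selects the ACPI PM timer as the timekeeping source;
            # notsc keeps the kernel from using the TSC, which drifts badly in a VM
            kernel /vmlinuz-2.6.9-78.EL ro root=/dev/VolGroup00/LogVol00 clock=pmtmr notsc
            initrd /initrd-2.6.9-78.EL.img

The change only takes effect after a reboot, so a cron-based ntpdate workaround like the ones above is still useful in the meantime.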
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Fri, 7 Nov 2008 16:15:08 +0530
>> From: lingu <hicheerup@xxxxxxxxx>
>> Subject: Cluster Broken pipe & node Reboot
>> To: "General Red Hat Linux discussion list" <redhat-list@xxxxxxxxxx>
>> Message-ID: <29e045b80811070245t1c303530xbf58626227638260@xxxxxxxxxxxxxx>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Hi all,
>>
>> I am running a two-node RHEL 3 U8 cluster (versions below) on HP servers,
>> connected via a SCSI channel to HP storage (SAN), serving an Oracle
>> database.
>>
>> Kernel & Cluster Version
>>
>> Kernel-2.4.21-47.EL #1 SMP
>> redhat-config-cluster-1.0.7-1-noarch
>> clumanager-1.2.26.1-1-x86_64
>>
>> Suddenly my active node got rebooted. After analysing the logs I found it
>> was throwing the errors below in syslog. I want to know what might cause
>> this type of error. Also, the sar output indicates there was no load on
>> the server at the time the system got rebooted, nor at the times I got the
>> I/O Hang errors.
>>
>> Nov 3 14:23:00 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>> Nov 3 14:23:00 cluster1 clulockd[1996]: <err> select error: Broken pipe
>> Nov 3 14:23:06 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>> Nov 3 14:23:06 cluster1 clulockd[1996]: <err> select error: Broken pipe
>> Nov 3 14:23:13 cluster1 cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!
>> Nov 3 14:23:15 cluster1 clulockd[1996]: <warning> Denied 20.1.2.161: Broken pipe
>> Nov 3 14:23:15 cluster1 clulockd[1996]: <err> select error: Broken pipe
>> Nov 3 14:23:12 cluster1 clusvcmgrd[2011]: <err> Unable to obtain cluster lock: Connection timed out
>>
>> Nov 5 17:18:00 cluster1 cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!
>> Nov 5 17:18:00 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>> Nov 5 17:18:00 cluster1 clulockd[1996]: <err> select error: Broken pipe
>> Nov 5 17:18:17 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>> Nov 5 17:18:17 cluster1 clulockd[1996]: <err> select error: Broken pipe
>> Nov 5 17:18:17 cluster1 clulockd[1996]: <warning> Potential recursive lock #0 grant to member #1, PID1962
>>
>> I need someone's help in working out how to fix this error, and the real
>> cause of the errors above.
>>
>> Attached is my cluster.xml file.
>>
>> <?xml version="1.0"?>
>> <cluconfig version="3.0">
>>   <clumembd broadcast="yes" interval="1000000" loglevel="5" multicast="no" multicast_ipaddress="" thread="yes" tko_count="25"/>
>>   <cluquorumd loglevel="7" pinginterval="5" tiebreaker_ip=""/>
>>   <clurmtabd loglevel="7" pollinterval="4"/>
>>   <clusvcmgrd loglevel="7"/>
>>   <clulockd loglevel="7"/>
>>   <cluster config_viewnumber="4" key="6672bc0a71be2ec9486f6a2f5846c172" name="ORACLECLUSTER"/>
>>   <sharedstate driver="libsharedraw.so" rawprimary="/dev/raw/raw1" rawshadow="/dev/raw/raw2" type="raw"/>
>>   <members>
>>     <member id="0" name="cluster1" watchdog="yes"/>
>>     <member id="1" name="cluster2" watchdog="yes"/>
>>   </members>
>>   <services>
>>     <service checkinterval="10" failoverdomain="oracle_db" id="0" maxfalsestarts="0" maxrestarts="0" name="database" userscript="/etc/init.d/script_db.sh">
>>       <service_ipaddresses>
>>         <service_ipaddress broadcast="None" id="0" ipaddress="20.1.2.35" monitor_link="1" netmask="255.255.0.0"/>
>>       </service_ipaddresses>
>>       <device id="0" name="/dev/cciss/c0d0p1" sharename="">
>>         <mount forceunmount="yes" fstype="ext3" mountpoint="/vol1" options="rw"/>
>>       </device>
>>       <device id="1" name="/dev/cciss/c0d0p2" sharename="">
>>         <mount forceunmount="yes" fstype="ext3" mountpoint="/vol2" options="rw"/>
>>       </device>
>>       <device id="2" name="/dev/cciss/c0d0p5" sharename="">
>>         <mount forceunmount="yes" fstype="ext3" mountpoint="/vol3" options="rw"/>
>>       </device>
>>     </service>
>>   </services>
>>   <failoverdomains>
>>     <failoverdomain id="0" name="oracle_db" ordered="no" restricted="yes">
>>       <failoverdomainnode id="0" name="cluster1"/>
>>       <failoverdomainnode id="1" name="cluster2"/>
>>     </failoverdomain>
>>   </failoverdomains>
>> </cluconfig>
>>
>> Regards,
>> Lingu
>>
>>
>> ------------------------------
>>
>> --
>> redhat-list mailing list
>> Unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
>> https://www.redhat.com/mailman/listinfo/redhat-list
>>
>> End of redhat-list Digest, Vol 57, Issue 7
>> ******************************************
>
>
> --
> redhat-list mailing list
> unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
> https://www.redhat.com/mailman/listinfo/redhat-list
>

--
Sent from my mobile device

--
redhat-list mailing list
unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list