Hi Kenneth,

This is a known issue with RHEL on VMware. You can find knowledge base
articles about time running slow on the VMware website. Please see this
link:

http://www.djax.co.uk/kb/linux/vmware_clock_drift.html

Regards,
Karthik

--- On Fri, 11/7/08, redhat-list-request@xxxxxxxxxx <redhat-list-request@xxxxxxxxxx> wrote:

> From: redhat-list-request@xxxxxxxxxx <redhat-list-request@xxxxxxxxxx>
> Subject: redhat-list Digest, Vol 57, Issue 7
> To: redhat-list@xxxxxxxxxx
> Date: Friday, November 7, 2008, 10:30 PM
>
> Send redhat-list mailing list submissions to
>     redhat-list@xxxxxxxxxx
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://www.redhat.com/mailman/listinfo/redhat-list
> or, via email, send a message with subject or body 'help' to
>     redhat-list-request@xxxxxxxxxx
>
> You can reach the person managing the list at
>     redhat-list-owner@xxxxxxxxxx
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of redhat-list digest..."
>
>
> Today's Topics:
>
>    1. RE: Cluster Heart Beat Using Cross Over Cable
>       (Karchner, Craig (IT Solutions US))
>    2. Help Slick Mach make the right choice! (mailanky@xxxxxxxxx)
>    3. NTP problem for virtual RHEL 4 server on VmWare (Kenneth Holter)
>    4. Cluster Broken pipe & node Reboot (lingu)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 6 Nov 2008 09:24:12 -0800
> From: "Karchner, Craig (IT Solutions US)" <craig.a.karchner@xxxxxxxxxxx>
> Subject: RE: Cluster Heart Beat Using Cross Over Cable
> To: "General Red Hat Linux discussion list" <redhat-list@xxxxxxxxxx>
> Message-ID: <13FE6613E1ADA041A0124537010C11E903742020@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
> Content-Type: text/plain; charset="us-ascii"
>
> Lingu,
>
> I had this same problem a few weeks back.
>
> This is how I solved it.
>
> Make sure your NICs are at 1G.
>
> Add the following entries into your cluster.ccs file and write it to
> disk;
>
>     heartbeat_rate = 30
>     allowed_misses = 4
>
> My cluster.ccs file looks like this now;
>
>     cluster {
>         name = "alpha"
>         lock_gulm {
>             servers = ["server1", "server2", "server3"]
>             heartbeat_rate = 30
>             allowed_misses = 4
>         }
>     }
>
> This example procedure shows how to change configuration files in a
> CCS archive.
>
> 1. Extract configuration files from the CCA device into temporary
>    directory /root/alpha-new/.
>
>     ccs_tool extract /dev/pool/alpha_cca /root/alpha-new/
>
> 2. Make changes to the configuration files in /root/alpha-new/.
>
> 3. Create a new CCS archive on the CCA device by using the -O
>    (override) flag to forcibly overwrite the existing CCS archive.
>
>     ccs_tool -O create /root/alpha-new/ /dev/pool/alpha_cca
>
>
> What you are suggesting (cross over cable) is not supported, at least
> in GFS 6.0, which I assume you are running with RHEL 3.0.
>
>
> -----Original Message-----
> From: redhat-list-bounces@xxxxxxxxxx
> [mailto:redhat-list-bounces@xxxxxxxxxx] On Behalf Of lingu
> Sent: Thursday, November 06, 2008 7:41 AM
> To: General Red Hat Linux discussion list
> Subject: Cluster Heart Beat Using Cross Over Cable
>
> Hi,
>
> I am running a two-node active/passive cluster on RHEL3 update 8,
> 64 bit OS, on an HP box with external HP storage connected via SCSI.
> My cluster was running fine for the last 3 years. But all of a sudden
> the cluster service keeps shifting (at least once a day) from one
> node to another.
>
> After analysing the syslog I found that the service was getting
> shifted due to some network fluctuation. Both nodes have two NICs
> bonded together and configured with the IPs below.
>
> My network details:
>
>     192.168.1.2 -- node 1 physical IP with class C subnet (bond0)
>     192.168.1.3 -- node 2 physical IP with class C subnet (bond0)
>     192.168.1.4 -- floating IP (cluster)
>
> Since it is a very critical and busy server, it may be that heavy
> network load causes some heartbeat signals to be missed, resulting in
> the service shifting from one node to another.
>
> So I planned to connect a crossover cable for heartbeat messages. Can
> anyone guide me, or provide a link that best explains how to do this
> and what changes I have to make in the cluster configuration file
> after connecting the crossover cable?
>
> Regards,
>
> Lingu
>
> --
> redhat-list mailing list
> unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
> https://www.redhat.com/mailman/listinfo/redhat-list
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 7 Nov 2008 13:38:09 +0530
> From: <mailanky@xxxxxxxxx>
> Subject: Help Slick Mach make the right choice!
> To: <redhat-list@xxxxxxxxxx>
> Message-ID: <B07101D09C1F45E0A8C6C191BEA2A154@webchutney2>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hey,
>
> ankur has signed you up for a perfect shave!
> Simply help Slick Mach make the right choice & you could win a free
> Gillette Mach 3 razor.
> Click here to take the challenge.
> <http://www.slickmach.com/index.html>
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 7 Nov 2008 10:49:04 +0100
> From: "Kenneth Holter" <kenneho.ndu@xxxxxxxxx>
> Subject: NTP problem for virtual RHEL 4 server on VmWare
> To: redhat-list@xxxxxxxxxx
> Message-ID: <c25f25140811070149u2d098492rf2c36e6b07941225@xxxxxxxxxxxxxx>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi.
>
> One of our RHEL 4 servers running on VmWare has a quite serious NTP
> problem.
> I know that NTP can be an issue when running Red Hat boxes on VmWare,
> so as a fix I put this small script in a file in /etc/cron.hourly:
>
>
>     [root@server cron.hourly]# cat ntpdate
>     #!/bin/sh
>     /etc/init.d/ntpd stop
>     ntpdate 1.2.3.4 >> /tmp/time_adjust.log
>     /etc/init.d/ntp
>
>
> After investigating the "/tmp/time_adjust.log" file, I was quite
> surprised by the amount of drift found on one particular server.
> Consider this extract from the file:
>
>     6 Nov 20:00:01 ntpdate[19373]: step time server 1.2.3.4 offset -60.504153 sec
>     6 Nov 20:00:52 ntpdate[19666]: step time server 1.2.3.4 offset -8.735440 sec
>     6 Nov 20:01:00 ntpdate[19689]: step time server 1.2.3.4 offset -1.635632 sec
>     6 Nov 20:54:06 ntpdate[24198]: step time server 1.2.3.4 offset -415.894712 sec
>     6 Nov 21:01:01 ntpdate[24920]: adjust time server 1.2.3.4 offset 0.136833 sec
>     6 Nov 22:01:02 ntpdate[29943]: adjust time server 1.2.3.4 offset -0.114253 sec
>     6 Nov 23:01:01 ntpdate[2519]: adjust time server 1.2.3.4 offset -0.036345 sec
>     7 Nov 00:01:00 ntpdate[7577]: step time server 1.2.3.4 offset -1.064935 sec
>     7 Nov 01:00:57 ntpdate[12697]: step time server 1.2.3.4 offset -3.922577 sec
>     7 Nov 02:00:21 ntpdate[17733]: step time server 1.2.3.4 offset -40.421825 sec
>     7 Nov 02:01:00 ntpdate[17777]: step time server 1.2.3.4 offset -1.123175 sec
>     7 Nov 02:57:23 ntpdate[22542]: step time server 1.2.3.4 offset -218.649820 sec
>     7 Nov 03:00:36 ntpdate[22900]: step time server 1.2.3.4 offset -25.284528 sec
>     7 Nov 03:00:58 ntpdate[22940]: step time server 1.2.3.4 offset -3.104130 sec
>     7 Nov 03:52:32 ntpdate[27430]: step time server 1.2.3.4 offset -509.363952 sec
>     7 Nov 03:59:50 ntpdate[27943]: step time server 1.2.3.4 offset -71.430354 sec
>     7 Nov 04:00:52 ntpdate[28236]: step time server 1.2.3.4 offset -9.344907 sec
>     7 Nov 04:01:00 ntpdate[28259]: step time server 1.2.3.4 offset -1.237651 sec
>     7 Nov 05:01:01 ntpdate[1363]: adjust time server 1.2.3.4 offset 0.390149 sec
>     7 Nov 06:01:01 ntpdate[6419]: adjust time server 1.2.3.4 offset -0.185112 sec
>     7 Nov 07:01:02 ntpdate[11493]: adjust time server 1.2.3.4 offset -0.228884 sec
>     7 Nov 08:00:59 ntpdate[16579]: step time server 1.2.3.4 offset -2.166519 sec
>     7 Nov 09:00:38 ntpdate[21522]: step time server 1.2.3.4 offset -23.169420 sec
>     7 Nov 09:01:02 ntpdate[21558]: adjust time server 1.2.3.4 offset -0.492106 sec
>     7 Nov 09:59:26 ntpdate[26329]: step time server 1.2.3.4 offset -95.154264 sec
>     7 Nov 10:00:55 ntpdate[26639]: step time server 1.2.3.4 offset -5.997955 sec
>     7 Nov 10:01:01 ntpdate[26658]: step time server 1.2.3.4 offset -0.506367 sec
>
>
> Does anyone know what may be causing the RHEL box to drift as much as
> 500 seconds in only one hour?
>
> Regards,
> Kenneth Holter
>
>
> ------------------------------
>
> Message: 4
> Date: Fri, 7 Nov 2008 16:15:08 +0530
> From: lingu <hicheerup@xxxxxxxxx>
> Subject: Cluster Broken pipe & node Reboot
> To: "General Red Hat Linux discussion list" <redhat-list@xxxxxxxxxx>
> Message-ID: <29e045b80811070245t1c303530xbf58626227638260@xxxxxxxxxxxxxx>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi all,
>
> I am running a two-node RHEL3U8 cluster, with the cluster version
> below, on HP servers connected via SCSI channel to HP storage (SAN)
> for an Oracle database server.
>
> Kernel & Cluster Version
>
>     Kernel-2.4.21-47.EL #1 SMP
>     redhat-config-cluster-1.0.7-1-noarch
>     clumanager-1.2.26.1-1-x86_64
>
>
> Suddenly my active node rebooted. After analysing the logs, I found
> it is throwing the errors below in syslog. I want to know what might
> cause this type of error. The sar output also indicates there was no
> load on the server at the time the system rebooted, nor at the times
> I am getting the I/O Hang error.
>
>     Nov 3 14:23:00 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>     Nov 3 14:23:00 cluster1 clulockd[1996]: <err> select error: Broken pipe
>     Nov 3 14:23:06 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>     Nov 3 14:23:06 cluster1 clulockd[1996]: <err> select error: Broken pipe
>     Nov 3 14:23:13 cluster1 cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!
>     Nov 3 14:23:15 cluster1 clulockd[1996]: <warning> Denied 20.1.2.161: Broken pipe
>     Nov 3 14:23:15 cluster1 clulockd[1996]: <err> select error: Broken pipe
>     Nov 3 14:23:12 cluster1 clusvcmgrd[2011]: <err> Unable to obtain cluster lock: Connection timed out
>
>     Nov 5 17:18:00 cluster1 cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!
>     Nov 5 17:18:00 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>     Nov 5 17:18:00 cluster1 clulockd[1996]: <err> select error: Broken pipe
>     Nov 5 17:18:17 cluster1 clulockd[1996]: <warning> Denied 20.1.2.162: Broken pipe
>     Nov 5 17:18:17 cluster1 clulockd[1996]: <err> select error: Broken pipe
>     Nov 5 17:18:17 cluster1 clulockd[1996]: <warning> Potential recursive lock #0 grant to member #1, PID1962
>
>
> I need someone's help in finding out how to fix this error, and also
> the real cause of the errors above.
>
> Attached is my cluster.xml file.
>
>
> <?xml version="1.0"?>
> <cluconfig version="3.0">
>   <clumembd broadcast="yes" interval="1000000" loglevel="5"
>     multicast="no" multicast_ipaddress="" thread="yes" tko_count="25"/>
>   <cluquorumd loglevel="7" pinginterval="5" tiebreaker_ip=""/>
>   <clurmtabd loglevel="7" pollinterval="4"/>
>   <clusvcmgrd loglevel="7"/>
>   <clulockd loglevel="7"/>
>   <cluster config_viewnumber="4"
>     key="6672bc0a71be2ec9486f6a2f5846c172" name="ORACLECLUSTER"/>
>   <sharedstate driver="libsharedraw.so" rawprimary="/dev/raw/raw1"
>     rawshadow="/dev/raw/raw2" type="raw"/>
>   <members>
>     <member id="0" name="cluster1" watchdog="yes"/>
>     <member id="1" name="cluster2" watchdog="yes"/>
>   </members>
>   <services>
>     <service checkinterval="10" failoverdomain="oracle_db" id="0"
>       maxfalsestarts="0" maxrestarts="0" name="database"
>       userscript="/etc/init.d/script_db.sh">
>       <service_ipaddresses>
>         <service_ipaddress broadcast="None" id="0"
>           ipaddress="20.1.2.35" monitor_link="1" netmask="255.255.0.0"/>
>       </service_ipaddresses>
>       <device id="0" name="/dev/cciss/c0d0p1" sharename="">
>         <mount forceunmount="yes" fstype="ext3" mountpoint="/vol1" options="rw"/>
>       </device>
>       <device id="1" name="/dev/cciss/c0d0p2" sharename="">
>         <mount forceunmount="yes" fstype="ext3" mountpoint="/vol2" options="rw"/>
>       </device>
>       <device id="2" name="/dev/cciss/c0d0p5" sharename="">
>         <mount forceunmount="yes" fstype="ext3" mountpoint="/vol3" options="rw"/>
>       </device>
>     </service>
>   </services>
>   <failoverdomains>
>     <failoverdomain id="0" name="oracle_db" ordered="no" restricted="yes">
>       <failoverdomainnode id="0" name="cluster1"/>
>       <failoverdomainnode id="1" name="cluster2"/>
>     </failoverdomain>
>   </failoverdomains>
> </cluconfig>
>
> Regards,
> Lingu
>
>
> ------------------------------
>
> __
> redhat-list mailing list
> Unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
> https://www.redhat.com/mailman/listinfo/redhat-list
>
>
> End of redhat-list Digest, Vol 57, Issue 7
> ******************************************
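
To see how often the clock is being stepped and by how much, the ntpdate
log quoted in Kenneth's message can be summarized with a short script.
The following is only a minimal sketch in Python and is not part of the
original thread: the names parse_ntpdate_log and summarize are made up
for illustration, and it assumes each log entry sits on a single line in
the format shown in the extract. The three sample entries are copied
from that extract.

```python
import re

# Matches ntpdate log entries such as:
#   6 Nov 20:54:06 ntpdate[24198]: step time server 1.2.3.4 offset -415.894712 sec
LINE_RE = re.compile(
    r"(?P<day>\d{1,2}) (?P<month>[A-Za-z]{3}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"ntpdate\[\d+\]: (?P<mode>step|adjust) time server \S+ "
    r"offset (?P<offset>-?\d+\.\d+) sec"
)


def parse_ntpdate_log(text):
    """Return a list of (timestamp, mode, offset_seconds) tuples."""
    entries = []
    for match in LINE_RE.finditer(text):
        stamp = "{day} {month} {time}".format(**match.groupdict())
        entries.append((stamp, match.group("mode"), float(match.group("offset"))))
    return entries


def summarize(entries):
    """Count step/adjust corrections and find the largest absolute offset."""
    steps = sum(1 for _, mode, _ in entries if mode == "step")
    worst = max(entries, key=lambda e: abs(e[2])) if entries else None
    return {"steps": steps, "adjusts": len(entries) - steps, "worst": worst}


# Three entries copied from the log extract in the thread.
SAMPLE = """\
6 Nov 20:54:06 ntpdate[24198]: step time server 1.2.3.4 offset -415.894712 sec
6 Nov 21:01:01 ntpdate[24920]: adjust time server 1.2.3.4 offset 0.136833 sec
7 Nov 03:52:32 ntpdate[27430]: step time server 1.2.3.4 offset -509.363952 sec
"""

if __name__ == "__main__":
    print(summarize(parse_ntpdate_log(SAMPLE)))
```

Passing the contents of /tmp/time_adjust.log to parse_ntpdate_log
instead of SAMPLE should give the same kind of summary for the whole
log, which makes it easier to compare the drift before and after
applying the VMware timekeeping changes.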