---
I am running a 13-node GFS (6.0.2.33) cluster with 10 mounting clients and 3 dedicated lock servers. The master lock server was rebooted and the next slave in the voting order took over. At that point, 3 of the client nodes started receiving login errors for the LTPX server:
Mar 4 00:05:52 lock1 lock_gulmd_core[3798]: Master Node Is Logging Out NOW!
...
Mar 4 00:05:52 lock2 lock_gulmd_core[24627]: Master Node has logged out.
Mar 4 00:05:54 lock2 lock_gulmd_core[24627]: I see no Masters, So I am Arbitrating until enough Slaves talk to me.
Mar 4 00:05:54 lock2 lock_gulmd_LTPX[24638]: New Master at lock2 :192.168.1.3
Mar 4 00:05:56 lock2 lock_gulmd_core[24627]: Now have Slave quorum, going full Master.
Mar 4 00:11:39 lock2 lock_gulmd_core[24627]: Master Node Is Logging Out NOW!
...
Mar 4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 "
Mar 4 00:05:52 client1 lock_gulmd_core[9383]: Master Node has logged out.
Mar 4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 "
Mar 4 00:05:56 client1 lock_gulmd_core[9383]: Found Master at lock2 , so I'm a Client.
Mar 4 00:05:56 client1 lock_gulmd_core[9383]: Failed to receive a timely heartbeat reply from Master. (t:1172988356370685 mb:1)
Mar 4 00:05:56 client1 lock_gulmd_LTPX[9390]: New Master at lock2 :192.168.1.3
Mar 4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT004: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT001: (lock2 :192.168.1.3) 1006:Not Allowed
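For anyone who wants to see which nodes are affected without reading the logs by hand, here is a rough Python sketch that tallies the LTPX login errors per client. It assumes the syslog line layout shown above; the regular expression and the function name count_login_errors are my own illustration, not part of GULM.

```python
import re
import sys
from collections import Counter

# Assumed line layout, taken from the excerpts above:
# Mar 4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
LOGIN_ERROR = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+lock_gulmd_LTPX\[\d+\]:\s+"
    r"Errors trying to login to (?P<table>LT\d+):\s+"
    r"\((?P<master>[^:)]*?)\s*:(?P<ip>[\d.]+)\)\s+(?P<code>\d+):(?P<reason>.+)$"
)


def count_login_errors(lines):
    """Count LTPX login errors, keyed by (client host, master name, reason)."""
    counts = Counter()
    for line in lines:
        match = LOGIN_ERROR.match(line.strip())
        if match:
            key = (match.group("host"), match.group("master"),
                   match.group("reason").strip())
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Feed syslog text on stdin, e.g. the messages file from each client.
    for (host, master, reason), n in sorted(count_login_errors(sys.stdin).items()):
        print(f"{host} -> {master}: {reason} x{n}")
```

Lines that do not match the assumed layout are simply skipped, so this is only useful as a quick tally and not as a complete log parser.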
---
Britt
On 3/5/07 10:30 PM, "Treece, Britt" <Britt.Treece@xxxxxxxxxx> wrote:
All,
After much further investigation, I found that the /etc/hosts entries for these 3 client nodes are off by one on all 3 lock servers. Having fixed the typos, is it safe to assume that the root cause of the LTPX login errors was that /etc/hosts on the lock servers was wrong for these nodes? If so, why were these 3 clients allowed into the cluster when it was originally started, given that their /etc/hosts entries were incorrect?
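To catch this kind of mismatch earlier, a check along the following lines could be run against copies of /etc/hosts collected from each lock server. This is only a sketch: the function names parse_hosts and find_mismatches are hypothetical, and the expected name-to-address map has to come from whatever you treat as the authoritative source. It does not by itself show that the hosts entries caused the login errors.

```python
def parse_hosts(text):
    """Return {hostname: ip} from /etc/hosts-style text."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name (assumption: later
            # duplicates are ignored by this check).
            mapping.setdefault(name, ip)
    return mapping


def find_mismatches(expected, hosts_by_server):
    """Compare each server's hosts text against the expected {name: ip} map.

    Returns a list of (server, name, expected_ip, found_ip) tuples, where
    found_ip is None if the name is missing on that server.
    """
    problems = []
    for server, text in sorted(hosts_by_server.items()):
        found = parse_hosts(text)
        for name, ip in sorted(expected.items()):
            if found.get(name) != ip:
                problems.append((server, name, ip, found.get(name)))
    return problems
```

For example, with expected = {"client1": "192.168.1.11"} (a made-up address) and the hosts file text from each lock server keyed by server name, any tuple returned points at an entry that needs to be looked at on that server.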
Regards,
Britt Treece
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster