Re: RE: Errors trying to login to LT000: ... 1006:Not Allowed

Britt Treece <britt.treece@xxxxxxxxxx> · Mon, 05 Mar 2007 22:51:26 -0600

Not sure why my first post didn't go through, but here it is...

---

I am running a 13-node GFS (6.0.2.33) cluster with 10 mounting clients and 3 dedicated lock servers.  The master lock server was rebooted and the next slave in the voting order took over.  At that time, 3 of the client nodes started receiving login errors for the LTPX server:

Mar  4 00:05:52 lock1 lock_gulmd_core[3798]: Master Node Is Logging Out NOW! 

... 

Mar  4 00:05:52 lock2 lock_gulmd_core[24627]: Master Node has logged out. 

Mar  4 00:05:54 lock2 lock_gulmd_core[24627]: I see no Masters, So I am Arbitrating until enough Slaves talk to me. 

Mar  4 00:05:54 lock2 lock_gulmd_LTPX[24638]: New Master at lock2 :192.168.1.3 

Mar  4 00:05:56 lock2 lock_gulmd_core[24627]: Now have Slave quorum, going full Master. 

Mar  4 00:11:39 lock2 lock_gulmd_core[24627]: Master Node Is Logging Out NOW! 

...

Mar  4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 " 

Mar  4 00:05:52 client1 lock_gulmd_core[9383]: Master Node has logged out. 

Mar  4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 " 

Mar  4 00:05:56 client1 lock_gulmd_core[9383]: Found Master at lock2 , so I'm a Client. 

Mar  4 00:05:56 client1 lock_gulmd_core[9383]: Failed to receive a timely heartbeat reply from Master. (t:1172988356370685 mb:1)

Mar  4 00:05:56 client1 lock_gulmd_LTPX[9390]: New Master at lock2 :192.168.1.3 

Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed 

Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed 

Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed 

Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed 

Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT004: (lock2 :192.168.1.3) 1006:Not Allowed 

Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT001: (lock2 :192.168.1.3) 1006:Not Allowed

---
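
For anyone triaging similar output, here is a rough Python sketch that tallies the "Errors trying to login" lines per node and lock table. The regular expression is only an assumption based on the six lines quoted above, not a documented lock_gulmd log format, and the name tally_login_errors is made up for this example.

```python
import re
from collections import Counter

# Pattern inferred from the syslog lines quoted above (an assumption, not a
# documented format). Note the optional space before the colon in "(lock2 :...)".
LOGIN_ERROR = re.compile(
    r"^\w+\s+\d+\s+[\d:]+\s+(?P<node>\S+)\s+lock_gulmd_LTPX\[\d+\]:\s+"
    r"Errors trying to login to (?P<table>LT\d+):\s+"
    r"\((?P<master>[^\s:]+)\s*:(?P<ip>[\d.]+)\)\s+(?P<code>\d+):(?P<reason>.*)$"
)


def tally_login_errors(lines):
    """Count LTPX login errors per (reporting node, lock table, error code)."""
    counts = Counter()
    for line in lines:
        match = LOGIN_ERROR.match(line.strip())
        if match:
            counts[(match["node"], match["table"], match["code"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed ',
        'Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed ',
        'Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed ',
    ]
    for key, count in sorted(tally_login_errors(sample).items()):
        print(key, count)
```

Run against /var/log/messages from each client, this would show quickly whether only the 3 affected nodes log the 1006 error and whether every lock table is refused.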

Britt

On 3/5/07 10:30 PM, "Treece, Britt" <Britt.Treece@xxxxxxxxxx> wrote:

All, 

After much further investigation I found that the /etc/hosts entries for these 3 client nodes are off by one on all 3 lock servers.  Having fixed the typos, is it safe to assume that the root of the problem logging in to LTPX is that /etc/hosts on the lock servers was wrong for these nodes?  If yes, why would these 3 clients have been allowed into the cluster when it was originally started, given that they had incorrect entries in /etc/hosts?
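
A quick way to catch this kind of drift is to diff the hosts files between nodes. The following is a minimal Python sketch, assuming the files have already been copied somewhere readable; parse_hosts and diff_hosts are hypothetical helper names, and the client1 addresses in the usage example are invented for illustration (only lock2 at 192.168.1.3 comes from the logs above).

```python
def parse_hosts(text):
    """Map each hostname or alias in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping


def diff_hosts(reference, candidate):
    """Return {name: (reference_ip, candidate_ip)} for every name in the
    reference whose address differs or is missing in the candidate."""
    ref = parse_hosts(reference)
    cand = parse_hosts(candidate)
    return {
        name: (ip, cand.get(name))
        for name, ip in ref.items()
        if cand.get(name) != ip
    }


if __name__ == "__main__":
    # Hypothetical contents: the client1 addresses are made up.
    known_good = "192.168.1.3 lock2\n192.168.1.11 client1\n"
    lock_server = "192.168.1.3 lock2\n192.168.1.12 client1\n"
    print(diff_hosts(known_good, lock_server))
```

An empty result for every lock server, compared against a known-good copy, would suggest the name-to-address mappings agree across the cluster.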

Regards, 

Britt Treece 

--

Linux-cluster mailing list

Linux-cluster@xxxxxxxxxx

https://www.redhat.com/mailman/listinfo/linux-cluster
