Hey,

First I used the default values (rate - 0.3, misses - 1).
When I saw the problem I tried to increase the misses value to 10, and
then it sometimes works and sometimes doesn't.

How do I add LoginLoops?

Thank you!
Oved

On Thu, 24 Mar 2005 10:06:18 -0600, Michael Conrad Tadpol Tilstra
<mtilstra@xxxxxxxxxx> wrote:
> On Wed, Mar 23, 2005 at 02:11:00PM +0200, Oved Ourfali wrote:
> > I have GFS version 6 installed on rhl es3 update 3.
> > The GFS includes 3 nodes, a, b and c.
> >
> > The three nodes run the lock_gulm daemon, and thus it runs in RLM mode.
> >
> > I have done some tests to check that the GFS works correctly, and I
> > ran into something very weird:
> > Let's assume the master is A, and B and C are slaves.
> > Disconnecting B or C from the network works fine.
> >
> > Disconnecting A causes a problem. Let's assume B tries to be the new
> > master. B indicates that A is down, but for some reason it also thinks
> > that C is down, thus it waits for enough slaves to contact it, and that
> > doesn't happen. I tried to increase the timeout, and now it sometimes
> > works and sometimes doesn't.
> >
> > Does anyone have a clue why it is happening?
>
> For some reason C isn't finding B in time to let it know that it is
> still alive. So, first question, what values are you using for
> heartbeat_rate and allowed_misses? Are you seeing this with the
> defaults, or are you using something else (before increasing it)?
>
> Also, you can add LoginLoops to the verbosity setting to have gulm print
> out much more detail when it is trying to connect and find the master
> server.
>
> --
> Michael Conrad Tadpol Tilstra
> BE ALERT!!!! (The world needs more lerts ...)