Re: [Linux-cluster] GFS 6.0 Questions

Michael Conrad Tadpol Tilstra <mtilstra@xxxxxxxxxx> · Tue, 15 Feb 2005 11:43:26 -0600

Gerald G. Gilyeat wrote:

[snip]
First, the GFS side of things is currently sharing the cluster's 
internal network for its communications, mostly because we didn't have 
a second switch to dedicate to the task. While the cluster is currently 
lightly used, how sub-optimal is this? I'm currently searching for 
another switch that a partnering department has/had, but I don't know if 
they even know where it is at this point.

It really depends on how much the actual link is used.  The more data 
that the other apps are pushing over the ethernet, the less of it gulm 
can use.  It is also rather (unfortunately) difficult to tell gulm to 
use a different network device in the current releases.  There is a fix 
pending for this, but it's not out yet.

Second: GFS likes to fence "e0" off on a fairly regular/common basis 
(once every other week or so, if not more often). This is really rather 
bad for us, from an operational standpoint - e0  is vital to the 
operation of our Biostatistics Department (Samba/NFS, user 
authentication, etc...). There is also some pretty nasty latency on 
occasion, with logins taking upwards of 30 seconds to return to a prompt, 
providing it doesn't time out to begin with.

If the machine is getting this kind of delay, it is completely possible 
that the delay is also causing heartbeats to be missed.

In trying to figure out -why- it's constantly being fenced off, and in 
trying to solve the latency/performance issues, I've noticed a -very- 
large number of "notices" from GFS like the following:

Feb 15 10:56:10 front-1 lock_gulmd_LT000[4073]: Lock count is at 1124832 
which is more than the max 1048576. Sending Drop all req to clients

Easy enough to gather that we're blowing away the current lock highwater 
mark.

Is upping the highwater point a feasible thing to do -and- would it have 
an effect on performance, and what would that effect be?

cluster.ccs:
cluster {
 lock_gulm {
   ....
   lt_high_locks = <int>
 }
}

The highwater mark is an attempt to keep the amount of memory lock_gulmd 
uses down.  When the highwater mark is hit, the lock server tells all gfs 
mounts to try to release locks.  It does this every 10 seconds until 
the lock count falls below the highwater mark.  Each of these rounds costs 
cycles, so hitting the mark less often means fewer cycles used.  The higher 
the highwater mark is, the more memory the gulm lock servers and gfs will 
use to store locks.  The number is just the count of locks (in <=6.0) and 
not an actual representation of RAM used.

In short, in your case a higher highwater mark may give some performance 
gain, at the cost of some memory available to other programs.
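
To get a feel for how far over the mark you are before picking a new 
lt_high_locks value, something like the following rough Python sketch 
could pull the numbers out of the notices.  It assumes the message text 
always looks like the line quoted above; the function names are made up 
for illustration and are not part of any GFS tool.

```python
import re

# Pattern for the lock_gulmd notice quoted earlier in this thread.
# Assumption: the wording of the message is stable across log lines.
LOCK_NOTICE = re.compile(
    r"Lock count is at (?P<count>\d+)\s+which is more than the max (?P<limit>\d+)"
)


def parse_lock_notice(line):
    """Return (count, limit) from a highwater notice, or None if no match."""
    match = LOCK_NOTICE.search(line)
    if match is None:
        return None
    return int(match.group("count")), int(match.group("limit"))


def overage(count, limit):
    """Return the excess lock count and that excess as a fraction of the limit."""
    excess = count - limit
    return excess, excess / limit


line = (
    "Feb 15 10:56:10 front-1 lock_gulmd_LT000[4073]: Lock count is at "
    "1124832 which is more than the max 1048576. Sending Drop all req to clients"
)
count, limit = parse_lock_notice(line)
excess, fraction = overage(count, limit)
print(f"{excess} locks over the limit ({fraction:.1%})")
```

Running this over a day's worth of notices and looking at the largest 
count seen would give a starting point for a new value; whether the 
extra memory is affordable still has to be checked on the lock servers.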

This weekend, we also noticed another weirdness (for us, anyways...) - 
e0 was fenced off on Saturday morning at 0504.09am, almost precisely 24 
hours later e0 decided that the problem was the previous GFS master 
(f0), arbitrated itself to be Master, took over, fenced off f0 and then 
proceeded to hose the entire thing by the time I heard about things and 
was able to get on-site to bring it all back up (at 1am Monday morning). 
What is this apparent 24-hour timer, and is this expected behaviour?

No, it sounds like some kind of freak chance.  A very icky thing indeed.  
It very much sounds like a higher heartbeat_rate is needed.

Finally - would increasing the heartbeat timer and the number of 
acceptable misses be an appropriate and acceptable way to help decrease 
the frequency of e0 being fenced off?

Certainly.  The default values for the heartbeat_rate and allowed_misses 
are just suggestions.  Certain setups may require different values, and 
as far as I know the only way to figure this out is to try it.  Sounds 
very much like you could use larger values.
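
As a back-of-the-envelope way to compare candidate settings, here is a 
small Python sketch.  It rests on a simplifying assumption that is mine 
and not a statement of how lock_gulmd actually counts: that a node gets 
roughly heartbeat_rate * allowed_misses seconds of silence before it is 
considered dead, with heartbeat_rate in seconds.  The values used are 
only illustrative.

```python
def tolerated_stall_seconds(heartbeat_rate, allowed_misses):
    """Rough model (assumption): seconds of silence a node can survive."""
    return heartbeat_rate * allowed_misses


def survives(stall_seconds, heartbeat_rate, allowed_misses):
    """True if a stall of the given length fits inside the modelled window."""
    return stall_seconds < tolerated_stall_seconds(heartbeat_rate, allowed_misses)


# The 30-second login delays reported in this thread, used as the stall length.
observed_stall = 30

# Hypothetical (heartbeat_rate, allowed_misses) pairs to compare.
for rate, misses in [(15, 2), (20, 3), (30, 3)]:
    window = tolerated_stall_seconds(rate, misses)
    verdict = "ok" if survives(observed_stall, rate, misses) else "at risk"
    print(f"heartbeat_rate={rate} allowed_misses={misses}: {window}s window, {verdict}")
```

Under this model, any pair whose window is no longer than the worst stall 
you see is a candidate for spurious fencing.  Larger values also mean a 
genuinely dead node takes longer to be noticed, so it is a trade-off that 
still needs testing on the real cluster.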

--
michael conrad tadpol tilstra
<my wit is my doom>