RE: GFS load average and locking

Wendy Cheng <wcheng@xxxxxxxxxx> · Thu, 09 Mar 2006 22:30:02 -0500

On Thu, 2006-03-09 at 17:04 -0600, Treece, Britt wrote:

> Is Redhat aware of any issues with GFS and flock syscalls?  

I just checked the kernel source and have a rough idea of what could go
wrong. In the RHEL 3 (Linux 2.4 based) kernel, flock has the following
logic:

1. lock_kernel (Big Kernel Lock - BKL)
2. call filesystem-specific supplemental lock
3. handle linux vfs flock
4. unlock_kernel
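
To make the ordering concrete, here is a rough user-space model of the
four steps in Python. It is only a sketch and not the kernel code: the
names sys_flock, fs_lock and vfs_flock are made up for illustration, and
a plain threading.Lock stands in for the BKL.

```python
import threading

# Stand-in for the Big Kernel Lock: one global lock shared by all callers.
bkl = threading.Lock()

def sys_flock(fs_lock, vfs_flock):
    """Toy model of the four-step flock path (hypothetical names)."""
    trace = []
    bkl.acquire()                    # 1. lock_kernel
    trace.append("lock_kernel")
    try:
        fs_lock()                    # 2. filesystem-specific supplemental lock
        trace.append("fs_lock")
        vfs_flock()                  # 3. generic VFS flock handling
        trace.append("vfs_flock")
    finally:
        bkl.release()                # 4. unlock_kernel
        trace.append("unlock_kernel")
    return trace
```

The point to notice is that whatever fs_lock() does, however long it
takes, happens while the global lock is held.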

There are two issues here:

* performance

Step 2 is a no-op for most local filesystems (e.g. ext3), and the code
path of step 3 is relatively short, so you won't see much impact from
the BKL. For GFS, if step 2 runs concurrently (as it does in other cases
such as read, write, etc.), it is reasonably "fast" unless you need the
lock for the very same file and/or the lock network traffic is
congested. However, adding the BKL on top of that has a big impact: it
virtually serializes *every* flock attempt.
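
The serialization effect can be illustrated with a small timing model.
This is a sketch under assumptions, not a measurement of GFS: each
caller's step 2 is modeled as a fixed sleep (fs_lock_latency), the
callers are assumed to lock different files, and run_flocks is a
hypothetical helper.

```python
import threading
import time

def run_flocks(n_callers, fs_lock_latency, use_bkl):
    """Return wall-clock seconds for n_callers modeled flock calls."""
    bkl = threading.Lock()  # stand-in for the BKL

    def caller():
        if use_bkl:
            with bkl:                        # step 2 runs under the global lock
                time.sleep(fs_lock_latency)  # modeled cluster lock round trip
        else:
            time.sleep(fs_lock_latency)      # step 2 runs concurrently

    threads = [threading.Thread(target=caller) for _ in range(n_callers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

With the global lock the total time grows roughly as n_callers times the
latency; without it the total stays close to a single latency. For
example, compare run_flocks(4, 0.05, True) with
run_flocks(4, 0.05, False).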

* deadlock

I'm a little fuzzy on how Linux's BKL is implemented. In theory, the
above sequence would deadlock (unless the BKL is dropped when a process
goes to sleep), regardless of whether step 2 is a no-op or not. I will
ask our base kernel folks about this.
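
One scenario that would fit this concern, modeled in Python. Everything
here is an assumption made for illustration: process P1 already holds
the file lock, P2 takes the BKL and then sleeps in step 2 waiting for
that file lock, and P1 needs the BKL to unlock. The helper
unlock_succeeds is hypothetical, and timeouts stand in for "blocked
forever".

```python
import threading

def unlock_succeeds(drop_bkl_on_sleep, timeout=0.2):
    """Return True if P1 can take the BKL and release its file lock."""
    bkl = threading.Lock()         # stand-in for the BKL
    file_lock = threading.Lock()   # stand-in for the lock on one file
    file_lock.acquire()            # P1 (this thread) already holds it
    p2_in_step2 = threading.Event()

    def p2_flock():
        bkl.acquire()                      # step 1: lock_kernel
        if drop_bkl_on_sleep:
            bkl.release()                  # assumed: BKL dropped before sleeping
        p2_in_step2.set()
        # Step 2: sleep waiting for the file lock that P1 holds.
        # (This model does not retake the BKL on wakeup.)
        if file_lock.acquire(timeout=timeout):
            file_lock.release()
        if not drop_bkl_on_sleep:
            bkl.release()                  # step 4: unlock_kernel

    p2 = threading.Thread(target=p2_flock)
    p2.start()
    p2_in_step2.wait()
    # P1 now tries to unlock, which has to pass through the BKL first.
    got_bkl = bkl.acquire(timeout=timeout / 2)
    if got_bkl:
        file_lock.release()
        bkl.release()
    p2.join()
    return got_bkl
```

In this model unlock_succeeds(False) returns False (P1 never gets the
BKL while P2 sleeps holding it), and unlock_succeeds(True) returns True.
Whether the real kernel behaves like the first or the second case is
exactly the open question above.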

In any case, I think we need to remove that BKL if we can. In the
meantime, to work around this issue, you have to either:

* use the previously mentioned PHP patch to turn off flock, if you can; or
* get the GFS U7 RPMs, which have two tuning parameters that could speed
up the lock process. However, I don't have quantitative data at the
moment to know how effective they'll be in this kind of situation.

-- Wendy

--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster