Re: GFS load average and locking

Wendy Cheng <wcheng@xxxxxxxxxx> · Thu, 09 Mar 2006 15:32:56 -0500

Marc Grimme wrote:

Although the strace does not show the output I know, the problem description
sounds like deja vu.
We had loads of problems with sessions on GFS and httpd processes ending up
in "D" state for some time (at high load times we had ServerLimit httpd
processes in D state per node, which left the service unavailable).
As I posted already, we think it is because of the "bad" locking of sessions
with php (the php sessions are on gfs, and strace showed those timeouts on
the session files). When you issue a "session_start", or whatever that
function is called, the session file is locked via an flock syscall. That
lock is held until you end the session, which is done implicitly when the tcp
connection to the client is ended. Now another httpd process (on whatever
node) calls "session_start" and tries an flock on that session file while
the first process still holds the lock. The second process may then run into
the timeouts seen (30-60 secs), which, as far as I remember, relate to the
timeout of the tcp connection defined in httpd.conf or to some timeout in
php.ini (there is an explanation for this but I cannot remember it ;-) ).
Nevertheless, in our scenario the problem was the "bad" session handling by
php. We have made a patch for the phplib with which you can disable the
locking, or do locking only implicitly and so keep consistency while session
data is read or written. With it we could make apache work as expected, and
we have not seen any "D" process for a year now.
Oh yes the patch can be found at
www.opensharedroot.org in the download section.

Besides: You will never encounter this on a local filesystem or nfs, as nfs
does not support flocks and silently ignores them.

Hi,

This does look like the problem description sent out by savvis.net folks 
during our off-list email exchanges. However, without actually looking 
at the thread traces (when they are in D state), it is difficult to be 
sure. One way to obtain the exact thread trace is to use the "crash" tool to
do a back trace (e.g. "bt <pid>"; you need the kernel debuginfo RPM though).
Britt, do let us know whether this php patch helps, and/or use the crash
command to obtain the thread trace output.

On the other hand, I don't understand how a local (non-cluster)
filesystem can be immune from this problem?
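
To make the flock contention discussed above concrete, here is a minimal
Python sketch (not the php or phplib code itself). It opens the same file
twice, the way two httpd processes would each open one session file, and
shows that the second exclusive flock is refused while the first is held.
The "sess_" file prefix and the helper names try_lock and demo are
hypothetical, chosen only for illustration. It assumes Linux and a
filesystem that enforces flock; a non-blocking lock is used so that the
example returns instead of waiting.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Try a non-blocking exclusive flock; return True if it was acquired."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Another open file description already holds the lock.
        return False


def demo():
    # Hypothetical stand-in for a session file.
    tmp_fd, path = tempfile.mkstemp(prefix="sess_")
    os.close(tmp_fd)
    # Two independent opens, as two separate httpd processes would do.
    holder = os.open(path, os.O_RDWR)
    waiter = os.open(path, os.O_RDWR)
    try:
        first = try_lock(holder)        # first request takes the lock
        blocked = not try_lock(waiter)  # second request is refused
        fcntl.flock(holder, fcntl.LOCK_UN)
        after_release = try_lock(waiter)  # succeeds once the lock is released
        return first, blocked, after_release
    finally:
        os.close(holder)
        os.close(waiter)
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

With a blocking flock (no LOCK_NB) the second caller would simply wait
until the holder releases the lock, which on a local filesystem is still
contention on the same session file.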

-- Wendy

--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster