performance on a 4 node cluster after 6/7 days

"Support @ Sylconia" <hosting@xxxxxxxxxxx> · Tue, 09 Jan 2007 18:59:39 +0100

Dear reader,

short version:
We are experiencing performance problems after 6 to 7 days of uptime on the node(s) that are not the lock master.

long version:
We have the following setup:

A 4-node RHCS cluster where node 4 (backend) exports 3 RAID disks via gnbd to the other nodes (1-3).
The other 3 nodes (frontend) import those exports via gnbd. We have created 4 LVs via clvmd on top of those imported disks.

logical volumes
/dev/mapper/vg0-tmp      9.7G  1.1M  9.7G   1% /phpsessions
/dev/mapper/vg0-config   9.7G  152K  9.7G   1% /config
/dev/mapper/vg0-logging  100G  126M  100G   1% /var/log/httpd
/dev/mapper/vg0-www      500G  222M  500G   1% /www

As lock manager we use lock_dlm, with the following RPMs installed:
dlm-kernel-2.6.9-44.3
dlm-1.0.1-1

gfs version
gfs_tool -V
gfs_tool 6.1.6 (built Aug 25 2006 15:17:50)

gnbd version
Copyright (C) Red Hat, Inc.  2004-2005  All rights reserved.
gnbd_import 1.0.8. (built Nov 14 2006 02:18:52)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.

cman_tool status
Protocol version: 5.0.1

OS version: CentOS 4.4 on all nodes.

All RPMs are from the centos.org website.

All nodes are connected via a separate gigabit NIC on a private gigabit VLAN. There is no other network traffic on this VLAN.

This all runs fine until about 6 or 7 days of uptime. Then the nodes that are not the lock master become very slow, for example when running a df command. While df runs, the CPU load rises to 4 or 5 and the node is barely responsive (the OS seems to hang for a few seconds).

Running the top command at the same time shows:
18524 root      15 -10     0    0    0 R 97.4  0.0   1:08.99 dlm_sendd
12959 root      18   0  4184  592  528 R  1.9  0.1   0:00.17 df

So I think the problem is in dlm, but I do not know how to debug this. Can someone give me some pointers? I checked /proc/cluster/dlm* but honestly do not know what to look for.
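
To put numbers on the symptom, the time df takes and the CPU time dlm_sendd burns while it runs could be logged at intervals over the week. Below is a minimal sketch of that idea, not a tested diagnostic. It assumes a Python 3 interpreter is available on the node (not the stock one on CentOS 4.4), and the helper names parse_cpu_ticks, find_pid and sample are made up for illustration. It reads only /proc/<pid>/stat, so it says nothing about what dlm itself is doing.

```python
import os
import subprocess
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is the state (field 3); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def find_pid(name):
    """Return the first PID whose command name equals `name`, or None."""
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                line = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        comm = line[line.index("(") + 1:line.rindex(")")]
        if comm == name:
            return int(entry)
    return None


def sample(name="dlm_sendd", command=("df",)):
    """Run `command` once; return (wall seconds, CPU seconds used by `name`)."""
    pid = find_pid(name)
    if pid is None:
        raise RuntimeError(f"no process named {name!r} found")
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = parse_cpu_ticks(f.read())
    start = time.monotonic()
    subprocess.run(command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall = time.monotonic() - start
    with open(f"/proc/{pid}/stat") as f:
        after = parse_cpu_ticks(f.read())
    return wall, (after - before) / ticks_per_second
```

Calling sample() from cron every hour or so and appending the two numbers to a file would show whether the slowdown builds up gradually or appears suddenly around day 6 or 7.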

regards
Constan
Sylconia.nl


--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster