Paul,

Does the GFS lock traffic have its own interface on each server and a segmented switch network?

What does the perl script do?

Regards,
Britt Treece
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Paul n McDowell
Sent: Wednesday, July 12, 2006 10:49 AM
To: linux-cluster@xxxxxxxxxx
Subject: GFS performance
Hi,

Can anyone provide techniques or suggestions to improve GFS performance?

The problem we have is best summarized by the output of a perl script that one of our frustrated developers has written. The script simply creates a file, then reads it, then removes it, and prints out the time it takes for each of these instructions to complete. Sometimes it takes only a second to do all three, while sometimes it takes as long as 15 seconds to do these 3 simple instructions. I've run the script on all five of the GFS file systems and see the same response characteristics. I've run the script when the systems are busy and when they are fairly quiet and still see the same issue.
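To give a concrete idea, the script does something along the lines of the following (a minimal sketch of the timing logic, not the developer's actual code, and the test path is just an example):

#!/usr/bin/perl
# Rough sketch: time the creation, read-back and removal of a small file on a GFS mount.
use strict;
use warnings;
use Time::HiRes qw(time);

my $file = "/crx/gfs_timing_test.$$";   # example path on one of the GFS mounts

my $t0 = time();
open(my $out, '>', $file) or die "create failed: $!";
print $out "test\n";
close($out);
my $t1 = time();

open(my $in, '<', $file) or die "read failed: $!";
my @lines = <$in>;
close($in);
my $t2 = time();

unlink($file) or die "remove failed: $!";
my $t3 = time();

printf "create %.2fs  read %.2fs  remove %.2fs  total %.2fs\n",
       $t1 - $t0, $t2 - $t1, $t3 - $t2, $t3 - $t0;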
To summarize the environment:

RH ES3 update 6 (GFS 6.0), an 11 node cluster environment with 3 redundant lock managers and five GFS file systems mounted on each participating server. The GFS file systems range from 100GB to 1.5TB.

The storage array is an EMC CX700 attached to a dual redundant SAN consisting of four 2Gb Brocade 3900 SAN switches.

The HBAs in all servers are Qlogic qla2340s, firmware version 3.03.14, driver version 7.07.00.

The servers are all HP DL585 64bit AMD Opteron server class machines, each configured with between 8GB and 32GB of memory.
I've raised a support call with Red Hat but, according to their experts, our configuration already seems to be set for optimum performance.

Red Hat provide a utility to get and set tunable GFS file system parameters, but there is next to no supporting documentation.
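For what it's worth, I believe the utility in question is the gettune/settune interface of gfs_tool, used roughly as follows (the parameter name and value below are placeholders; what the individual tunables mean and what values are safe is exactly what we can't find documented):

[root@iclc1g tmp]# gfs_tool gettune /crx
[root@iclc1g tmp]# gfs_tool settune /crx <parameter> <value>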
So, is there anything I can do, or am I missing something obvious that is just plainly mis-configured?
Shown below is the GFS configuration summary derived from lock_gulmd -C and gfs_tool df for each file system. I'll be happy to supply any other information if it will help.

Thanks to all in advance,

Paul McDowell
lock_gulmd -C

# hashed: 0x44164246
cluster {
  name = "cra_gfs"
  lock_gulm {
    heartbeat_rate = 15.000
    allowed_misses = 3
    coreport = 40040
    new_connection_timeout = 15.000
    # server cnt: 3
    # servers = ["iclc1g.cra.applera.net", "iclc2g.cra.applera.net", "ccf001g.cra.applera.net"]
    servers = ["172.20.8.21", "172.20.8.22", "172.20.8.51"]
    lt_partitions = 4
    lt_base_port = 41040
    lt_high_locks = 20971520
    lt_drop_req_rate = 300
    prealloc_locks = 5000000
    prealloc_holders = 11000000
    prealloc_lkrqs = 60
    ltpx_port = 40042
# gfs_tool df /crx
/crx:
SB lock proto = "lock_gulm"
SB lock table = "cra_gfs:cra_crx"
SB ondisk format = 1308
SB multihost format = 1401
Block size = 4096
Journals = 11
Resource Groups = 1988
Mounted lock proto = "lock_gulm"
Mounted lock table = "cra_gfs:cra_crx"
Mounted host data = ""
Journal number = 0
Lock module flags = async
Local flocks = FALSE
Local caching = FALSE
Type           Total          Used           Free           use%
------------------------------------------------------------------------
inodes         933593         933593         0              100%
metadata       943899         121868         822031         13%
data           128274180      58546879       69727301       46%
[root@iclc1g tmp]# gfs_tool df /crx/data
/crx/data:
SB lock proto = "lock_gulm"
SB lock table = "cra_gfs:cra_crxdata"
SB ondisk format = 1308
SB multihost format = 1401
Block size = 4096
Journals = 11
Resource Groups = 5970
Mounted lock proto = "lock_gulm"
Mounted lock table = "cra_gfs:cra_crxdata"
Mounted host data = ""
Journal number = 0
Lock module flags = async
Local flocks = FALSE
Local caching = FALSE
Type           Total          Used           Free           use%
------------------------------------------------------------------------
inodes         3296091        3296091        0              100%
metadata       2649271        616186         2033085        23%
data           385236382      310495360      74741022       81%
[root@iclc1g tmp]# gfs_tool df /crx/home
/crx/home:
SB lock proto = "lock_gulm"
SB lock table = "cra_gfs:cra_crxhome"
SB ondisk format = 1308
SB multihost format = 1401
Block size = 4096
Journals = 11
Resource Groups = 3978
Mounted lock proto = "lock_gulm"
Mounted lock table = "cra_gfs:cra_crxhome"
Mounted host data = ""
Journal number = 0
Lock module flags = async
Local flocks = FALSE
Local caching = FALSE
Type           Total          Used           Free           use%
------------------------------------------------------------------------
inodes         3477487        3477487        0              100%
metadata       3162164        341627         2820537        11%
data           254032093      157709829      96322264       62%
[root@iclc1g tmp]# gfs_tool df /usr/local
/usr/local:
SB lock proto = "lock_gulm"
SB lock table = "cra_gfs:cra_usrlocal"
SB ondisk format = 1308
SB multihost format = 1401
Block size = 4096
Journals = 11
Resource Groups = 394
Mounted lock proto = "lock_gulm"
Mounted lock table = "cra_gfs:cra_usrlocal"
Mounted host data = ""
Journal number = 0
Lock module flags = async
Local flocks = FALSE
Local caching = FALSE
Type           Total          Used           Free           use%
------------------------------------------------------------------------
inodes         765762         765762         0              100%
metadata       582989         22854          560135         4%
data           24393837       9477084        14916753       39%
[root@iclc1g tmp]# gfs_tool df /data
/data:
SB lock proto = "lock_gulm"
SB lock table = "cra_gfs:cra_GQ"
SB ondisk format = 1308
SB multihost format = 1401
Block size = 4096
Journals = 11
Resource Groups = 1298
Mounted lock proto = "lock_gulm"
Mounted lock table = "cra_gfs:cra_GQ"
Mounted host data = ""
Journal number = 0
Lock module flags = async
Local flocks = FALSE
Local caching = FALSE
Type           Total          Used           Free           use%
------------------------------------------------------------------------
inodes         10026          10026          0              100%
metadata       282680         189037         93643          67%
data           103761726      94277221       9484505        91%