[Linux-cluster] LOCK_DLM Performance under Fire

Hi, Everyone --
 
I've been playing around with RHEL 4 and GFS from the tar files (not CVS) on three OptiPlex GX280 workstations using hyperthreading, SATA drives, and GNBD for sharing over a 1Gb network (dual NICs per machine).  I'm exploring moving a legacy file-based COBOL application/database over to Linux on a bunch of smaller boxes vs. its current home on a quad-processor AIX machine.  I have a test application which applies a bunch of file and record locks on and within files, along with some processor-intensive sorting algorithms, to stress test the solution.  I'm running into some serious performance discrepancies that I hope someone can help me make sense of.  Here's what I see when I test this app on different file systems:
 
ext3 on local disk; completes in about 3 min 20 sec.
ext3 on GNBD exported disk (one node only, obviously); completes in about 3 min 35 sec.
GFS on GNBD mounted with the localflocks option; completes in 5 min 30 sec.
GFS on GNBD mounted using LOCK_DLM with only one server mounting the fs; completes in 50 min 45 sec.
GFS on GNBD mounted using LOCK_DLM with two servers mounting the fs; went over 80 min and wasn't even half done.
GFS on GNBD mounted using LOCK_GULM...don't want to go there; I left it running for over 2 hrs and it was worse off than the two servers using LOCK_DLM.  :)
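
For a rough sense of scale, the four runs that actually finished can be turned into slowdown factors relative to local ext3. This is only arithmetic on the timings listed above; the labels and variable names are made up for illustration. It puts the single-node LOCK_DLM run at roughly 15 times the local ext3 time.

```python
# Slowdown of each completed run relative to ext3 on local disk.
# Timings come from the list above; the labels are illustrative only.

def to_seconds(minutes, seconds):
    """Convert a 'M min S sec' timing to seconds."""
    return minutes * 60 + seconds

runs = {
    "ext3 local": to_seconds(3, 20),
    "ext3 on GNBD": to_seconds(3, 35),
    "GFS localflocks": to_seconds(5, 30),
    "GFS LOCK_DLM, one node": to_seconds(50, 45),
}

baseline = runs["ext3 local"]
slowdowns = {name: secs / baseline for name, secs in runs.items()}

for name, factor in slowdowns.items():
    print(f"{name:24s} {runs[name]:5d} s  {factor:6.2f}x")
```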
 
The test app mostly does a whole lot of file & record level locking -- not a lot of data transfer from the source disk to the memory of the local server.  iostat on the client and server both show that the transfer rate of data on and off the hard disk is only about 300 kB/s.  top shows that the CPU on the client is being beat up: dlm_astd, lock_dlm1, and lock_dlm2 are taking on average 50% - 60% of the processor between them (30%, 15%, 15%) and my test app is taking up the rest.  When it's running on ext3 or GFS mounted with localflocks, there isn't this problem at all -- the test app goes to 99% of CPU; hence the faster completion times.  I have isolated the data paths so that the GNBD data is running over one NIC and the rest of the cluster data is on the second NIC in these computers.
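
To make the access pattern concrete, here is a minimal sketch of a lock-heavy loop that could be timed on each mount. It is not the actual COBOL test app; it assumes the record locks are POSIX byte-range locks (fcntl), and the function names, record count, and record size are all hypothetical.

```python
import fcntl
import os
import tempfile
import time

def make_test_file(directory, records=1000, record_size=128):
    """Create a zero-filled file holding `records` fixed-size records."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (records * record_size))
    return path

def time_record_locks(path, records=1000, record_size=128):
    """Take and release an exclusive byte-range lock on each record in turn.

    Returns the elapsed wall-clock time in seconds. Almost no data is
    read or written, so the time is dominated by lock traffic.
    """
    with open(path, "r+b") as f:
        start = time.perf_counter()
        for i in range(records):
            offset = i * record_size
            fcntl.lockf(f, fcntl.LOCK_EX, record_size, offset, os.SEEK_SET)
            fcntl.lockf(f, fcntl.LOCK_UN, record_size, offset, os.SEEK_SET)
        return time.perf_counter() - start

if __name__ == "__main__":
    # Point `directory` at the mount under test (ext3, GFS, ...).
    directory = tempfile.gettempdir()
    path = make_test_file(directory)
    try:
        elapsed = time_record_locks(path)
        print(f"1000 lock/unlock pairs took {elapsed:.4f} s")
    finally:
        os.remove(path)
```

Running the same loop against each mount point would show whether the per-lock cost alone accounts for the gap, independent of the sorting work.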
 
Anyone have some ideas on how to tune this?  Would exporting the GNBD file system with caching enabled help as I'm not using multiple GNBD servers, just multiple GNBD clients?  Other options?  Am I just way off base here?
 
Thanks!
________________________________________
Peter Shearer
A+, MCSE, MCSE: Security, CCNA
IT Network Engineer
Lumbermens
 
