Hi,

I'd suggest filing a bug in the first instance. I can't see anything
obviously wrong with what you are doing. The fcntl() locks go via the
dlm and dlm_controld, not via the glock workqueues, so I don't think
that is likely to be the issue,

Steve.

On Thu, 2009-12-03 at 12:42 -0800, Ray Van Dolson wrote:
> We have a two node cluster primarily acting as an NFS serving
> environment. Our backup infrastructure here uses NetBackup and,
> unfortunately, NetBackup has no PPC client (we're running on IBM JS20
> blades), so we're approaching the backup strategy in two different
> ways:
>
> - Run the NetBackup client from another machine and point it at an
>   NFS share on one of our two cluster nodes
> - Run rsyncd on our cluster nodes and rsync from a remote machine.
>   NetBackup then backs up that machine.
>
> The GFS2 filesystem in our cluster is only storing about 90GB of
> data, but has about one million files on it (inodes used, as reported
> via df -i).
>
> (For the curious, this is a home directory server and we do break
> things up under a top-level hierarchy of a folder for each first
> letter of a username.)
>
> The NetBackup-over-NFS route is extremely slow and spikes the load on
> whichever server is being backed up from. We made the following
> adjustments to try and improve performance:
>
> - Set the following in our cluster.conf file:
>
>   <dlm plock_ownership="1" plock_rate_limit="0"/>
>   <gfs_controld plock_rate_limit="0"/>
>
>   ping_pong will give me about 3-5k locks/sec now.
>
> - Mounted the filesystem with noatime,nodiratime,quota=off
>
> This seems to have helped a bit, but things are still taking a long
> time. I should note here that I tried running ping_pong against one
> of our cluster nodes via one of its NFS exports of the GFS2
> filesystem. While I can get 3000-5000 locks/sec locally, over NFS it
> was about... 2 or 3 (not thousand, literally 2 or 3). tcpdump of the
> NLM port shows the NFS lock manager on the node responding NLM_BLOCK
> most of the time.
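[Editor's note: a minimal local illustration of the fcntl() byte-range
locking that ping_pong (and, over NFS, the NLM lock manager) exercises.
This is a rough single-process sketch, not the ping_pong tool itself;
the file path and duration are arbitrary.]

```python
import fcntl
import os
import tempfile
import time

def lock_rate(path, duration=1.0):
    """Acquire and release an exclusive byte-range fcntl lock in a
    loop, returning locks per second. This is the same fcntl() code
    path that ping_pong measures (uncontended here, so the numbers
    are only a rough upper bound)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    count = 0
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0)  # lock byte 0
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0)  # release it
            count += 1
    finally:
        os.close(fd)
    return count / duration

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        print("locks/sec: %.0f" % lock_rate(f.name))
```

Running this against a file on the GFS2 mount, and then against the
same file over an NFS mount of it, would show the same local-vs-NLM
gap described above.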
> I'm not sure if GFS2 or our NFS daemon is to blame... in any case...
>
> ... I've set up rsyncd on the cluster nodes and am syncing from a
> remote server now (all of this via Gigabit Ethernet). I'm over an
> hour in and the client is still generating the file list. strace
> confirms that rsync --daemon is still trawling through, generating a
> list of files on the filesystem...
>
> I've done a blktrace dump on my GFS2 filesystem's block device and
> can clearly see glock_workqueue showing up the most by far. However,
> I don't know what else I can glean from these results.
>
> Does anyone have any tips or suggestions on improving either our NFS
> locking or our rsync --daemon performance beyond what I've already
> tried? It might almost be quicker for us to do a full backup each
> time than to spend hours building file lists for differential
> backups :)
>
> Details of our setup:
>
> - IBM DS4300 Storage (12-drive RAID5 + 2 spares)
>   - Exposed as two LUNs (one per controller)
>   - Don't believe this array does hardware snapshots :(
> - Two (2) IBM JS20 Blades (PPC)
>   - QLogic ISP2312 2Gb HBAs
>   - RHEL 5.4 Advanced Platform PPC
>   - multipathd
>   - clvm aggregates the two LUNs
>   - GFS2 on top of clvm
>     - Configured with quotas originally, but disabled later by
>       mounting with quota=off
>     - Mounted with noatime,nodiratime,quota=off
>
> # gfs2_tool gettune /domus1
> new_files_directio = 0
> new_files_jdata = 0
> quota_scale = 1.0000   (1, 1)
> logd_secs = 1
> recoverd_secs = 60
> statfs_quantum = 30
> stall_secs = 600
> quota_cache_secs = 300
> quota_simul_sync = 64
> statfs_slow = 0
> complain_secs = 10
> max_readahead = 262144
> quota_quantum = 60
> quota_warn_period = 10
> jindex_refresh_secs = 60
> log_flush_secs = 60
> incore_log_blocks = 1024
>
> # gfs2_tool getargs /domus1
> data 2
> suiddir 0
> quota 0
> posix_acl 1
> upgrade 0
> debug 0
> localflocks 0
> localcaching 0
> ignore_local_fs 0
> spectator 0
> hostdata jid=1:id=196610:first=0
> locktable
> lockproto
>
> Thanks in
> advance for any advice.
>
> Ray

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
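[Editor's note: on the file-list question above — rsync's generator has
to lstat() every directory entry before any data moves, and on GFS2
each lstat can require a cluster glock, which is where the hours go on
a ~1M-inode filesystem. A rough, self-contained sketch of that
per-inode cost; the path in the example is illustrative.]

```python
import os
import time

def count_stats(root):
    """Walk a tree the way rsync's file-list generator must: one
    lstat() per directory entry. Returns (entries statted, seconds
    taken). On GFS2 each lstat may involve a cluster glock, so the
    per-entry cost is far higher than on a local filesystem."""
    n = 0
    t0 = time.monotonic()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lstat(os.path.join(dirpath, name))
            n += 1
    return n, time.monotonic() - t0

if __name__ == "__main__":
    # Illustrative path; point it at the GFS2 mount to estimate how
    # long a bare metadata walk takes, independent of rsync itself.
    entries, secs = count_stats("/home")
    print("%d entries statted in %.1fs" % (entries, secs))
```

Comparing this walk time on the GFS2 mount against the rsync file-list
time would show how much of the delay is pure metadata cost rather
than anything rsync-specific.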