Hi Bob/Steven/Ben - many thanks for responding.
There is some helpful stuff here on the tuning side: http://sources.redhat.com/cluster/wiki/FAQ/GFS#gfs_tuning
Indeed, we have already implemented many of these suggestions: "fast statfs" is on, -r 2048 was used, quotas are off, the cluster interconnect is a dedicated gigabit LAN, the SAN uses hardware RAID (RAID10), and so on. Maybe we are just at the limit of the hardware.
I have also asked around, and it seems that the one issue that might cause a slowdown, multiple nodes all trying to access the same inode (say, all updating files in a common directory), should not happen with our application. I am told that batch jobs essentially create their own working directory when executing, and work almost exclusively within that subtree. Interactive work is in another tree entirely.
However I'd like to double check that - but how? When we looked at Lustre for a similar app there was a /proc interface that you could probe to see what files were being opened/read/written/closed by each connected node - does GFS offer something similar? Would mounting debugfs help me there?
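Failing a GFS-specific interface, one crude fallback I could try is to collect the list of open files on each node myself (for example from lsof output, which is my assumption, not something GFS provides) and look for directories that more than one node is touching. A minimal sketch of that comparison follows; the function name, the node names and the sample paths are all hypothetical, and it only catches files that happen to be open when the snapshot is taken:

```python
from collections import defaultdict
import posixpath


def shared_directories(open_paths_by_node):
    """Return {directory: sorted node names} for every directory
    that contains open files from two or more nodes."""
    nodes_by_dir = defaultdict(set)
    for node, paths in open_paths_by_node.items():
        for path in paths:
            # Group by the parent directory of each open file.
            nodes_by_dir[posixpath.dirname(path)].add(node)
    return {d: sorted(n) for d, n in nodes_by_dir.items() if len(n) > 1}


# Hypothetical snapshot: one list of open paths per node.
sample = {
    "node1": ["/gfs/batch/job101/out.dat", "/gfs/shared/log.txt"],
    "node2": ["/gfs/batch/job102/out.dat", "/gfs/shared/log.txt"],
}

print(shared_directories(sample))
```

If the batch jobs really do stay within their own subtrees, the result should be empty, or at least limited to directories we already expect to be shared.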
Kevin