> When a large multi-user, multi-file, multi-threaded simulation with a total file output of 18GB is run, I plot the output of vmstat 1 and see a definite, very periodic pattern. The bo values start at around 200MB/s, then drop down to 0 in most cases for a few seconds, then spike to ~700MB/s and ease back down through 200 and 150 to 0. It looks very much like a caching issue to me. These numbers are almost identical on the FC switches.
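As an aside, if you want a cleaner trace to plot, something like the following captures just the bo column. This is a minimal sketch assuming the standard procps vmstat column order (bo is field 10, nominally 1024-byte blocks per interval):

    # Print only bo, skipping the repeating header lines (whose 10th
    # field is non-numeric); fflush() keeps the pipe unbuffered.
    vmstat 1 | awk '$10 ~ /^[0-9]+$/ { print $10; fflush() }' | tee bo_trace.txt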
How are your test files distributed across directories, and what is your ratio of reads to writes? Are you mounting with noatime,nodiratime,noquota? What is your cluster interconnect?
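For reference, the sort of line I mean, as an illustrative fstab entry. The device path, mount point and filesystem type (I'm assuming GFS here) are placeholders; adjust to your setup:

    # Illustrative only - substitute your own device and mountpoint.
    /dev/cluster_vg/gfs_lv  /mnt/gfs  gfs  noatime,nodiratime,noquota  0 0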
If all your files are in the same directory (or a small number of subdirectories) and access is distributed across all the nodes, then I have to say that you may well be out of luck, and what you are seeing is normal. Bouncing directory locks between the nodes on each access will introduce enough latency to kill the performance. Also remember that no two nodes can hold an exclusive lock on the same file at the same time, and file creation/deletion requires an exclusive lock on the directory, which in turn means only one file creation/deletion can happen per directory at any one time.
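If you have any control over the output layout, giving each node its own subdirectory sidesteps most of that directory-lock bouncing. A rough sketch, with made-up paths and node names:

    # One output directory per node, so creates/deletes on one node
    # do not take the directory lock every other node needs.
    for node in node1 node2 node3 node4; do
        mkdir -p /mnt/gfs/output/$node
    done
    # Each node then writes only under its own tree:
    #   /mnt/gfs/output/$(hostname -s)/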
I can well believe the 900MB/s figure if you are just reading back one big file from multiple nodes. But the performance will fall off a cliff on random I/O involving writes.
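A quick way to see the difference for yourself and rule the hardware in or out (file paths are placeholders; run the read from several nodes at once to approximate the streaming case):

    # Single-stream sequential read, then a synced sequential write.
    dd if=/mnt/gfs/bigfile of=/dev/null bs=1M
    dd if=/dev/zero of=/mnt/gfs/ddtest bs=1M count=4096 conv=fsync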
Gordan

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster