Hi,

On 10/11/15 04:56, Dil Lee wrote:
Hi, I have a CentOS 6.5 cluster that is connected to a Fibre Channel SAN in a star topology. All nodes/SAN_storages have a single-pair fibre connection and no multipathing. The possibility of a hardware issue has been eliminated because read/write between all other node/SAN_storage pairs works perfectly. Problem: Everything was running perfectly for years. Then node3 suddenly has very slow writes to SAN_storage1, ~10KB/sec. Read speed seems to remain normal. Can anyone give me some pointers to debug the problem? Thank you. Dil
The usual things to look for are: the load being asymmetric across the nodes; the filesystem becoming full (although that is less likely if the other nodes are still working at higher speed); an increase in the amount of cross-node invalidation due to locking; or some reason for communications to that node being slower (e.g. packet loss or similar).
One way to help debug this would be to look at the block device with blktrace and see if there are any obvious differences in latency of reads/writes between the faster and slower nodes, and whether that is down to access pattern or not.
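As a rough illustration of that comparison, here is a minimal Python sketch that reads blkparse text output and collects the time between a request being issued to the driver (action D) and its completion (action C), split into reads and writes. It assumes the default blkparse line layout (device, CPU, sequence, timestamp, PID, action, RWBS, sector + blocks), for example as produced by `blktrace -d /dev/sdX -o - | blkparse -i -` with your own device in place of `/dev/sdX`. The function names are made up for this example, and matching on the start sector is a simplification that will miss requests that were split or merged.

```python
import re
from collections import defaultdict

# Assumed default blkparse layout:
#   dev cpu seq timestamp pid action rwbs sector + nblocks [process]
LINE_RE = re.compile(
    r"^\s*(\d+,\d+)\s+\d+\s+\d+\s+(\d+\.\d+)\s+\d+\s+([A-Z])\s+(\S+)"
    r"\s+(\d+)\s+\+\s+(\d+)"
)


def dispatch_to_complete_latencies(lines):
    """Return {'R': [...], 'W': [...]} of D->C latencies in seconds."""
    pending = {}                   # (dev, direction, sector) -> issue time
    latencies = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue               # summary lines, messages, other formats
        dev, ts, action, rwbs, sector, _nblocks = m.groups()
        if "W" in rwbs:
            direction = "W"
        elif "R" in rwbs:
            direction = "R"
        else:
            continue               # e.g. discards or flush-only requests
        key = (dev, direction, int(sector))
        if action == "D":          # issued to the driver
            pending[key] = float(ts)
        elif action == "C" and key in pending:   # completed
            latencies[direction].append(float(ts) - pending.pop(key))
    return latencies


def summarize(latencies):
    """Return {direction: (count, mean_seconds, max_seconds)}."""
    return {
        d: (len(v), sum(v) / len(v), max(v))
        for d, v in latencies.items()
        if v
    }
```

Running the same capture on node3 and on one of the healthy nodes, then comparing the `summarize()` results for `'W'`, should show whether the slow writes are spending their time below the block layer or elsewhere.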
So there are several possible things to follow up on to help narrow the issue down.
Steve. -- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster