On Thu, 2008-04-17 at 09:08 +0200, Peter wrote:
> Hi!
>
> In our Cluster we have the following entry in the "messages" logfile:
>
> "qdiskd[4314]: <warning> qdisk cycle took more than 3 seconds to
> complete (3.890000)"

It means it took more than 3 seconds for one qdiskd cycle to complete.
That is a very long time, given how little work one cycle does:

  8192 bytes in 16 block reads
  some internal calculations
  512 bytes in 1 block write

(that's it...)

> These messages are very frequent. I can not find anything except the
> source code via google and I am sorry to say that I am not so familiar
> with C to get the point.
>
>
> We also have sometimes a quorum timeout:
>
> "kernel: CMAN: Quorum device /dev/sdh timed out"
>
>
> Are these two messages independent and what is the meaning of the
> first message?

No, they're 100% related.  It sounds like qdiskd is getting starved for
I/O to /dev/sdh, or possibly it's getting CPU-starved for some reason.
Being that it's more or less a real-time program which helps keep the
cluster running, that's bad!

In your case, it's getting hung up for longer than the cluster failover
time, so CMAN thinks qdiskd has died.  Not good.

(1) Turn *off* status_file if you have it enabled!  It's for debugging,
and under certain load patterns, it can really slow down qdiskd.

(2) If you think it's I/O, what you should try is (assuming you're using
cluster2/rhel5/centos5/etc. here):

    echo deadline > /sys/block/sdh/queue/scheduler

If you had a default of 10 seconds (1 interval 10 tko), you should also
do:

    echo 2500 > /sys/block/sdh/queue/iosched/write_expire

... you've got at least 3 for interval, so I'm not sure this would apply
to you.

[NOTE: On rhel4/centos4/stable, I think you have to set the I/O
scheduler globally in the kernel command line at system boot.]

(3) If you think qdiskd is getting CPU starved, you can adjust the
'scheduler' and 'priority' values in cluster.conf to something
different.
I think the man page might be wrong; I think the highest 'priority'
value for the 'rr' scheduler is 99, not 100.  See the qdisk(5) man page
for more information on those.

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
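
For point (2) above, here is a minimal shell sketch of checking which I/O scheduler is currently active on the quorum device before switching it. It assumes the usual sysfs layout, where /sys/block/<dev>/queue/scheduler lists the available schedulers with the active one in square brackets. The helper name active_scheduler and the "apply" argument are made up for illustration; sdh is the device from the log message in this thread. Treat it as a starting point, not a tested procedure for any particular cluster.

```shell
#!/bin/sh
# Sketch only: report (and optionally switch) the I/O scheduler used for
# the quorum disk. The helper name and the "apply" argument are
# hypothetical; adjust DEV to match your own quorum device.

# Print the scheduler shown in [brackets] in a sysfs scheduler line,
# e.g. "noop anticipatory deadline [cfq]" prints "cfq".
# Prints nothing if the line contains no bracketed entry.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

DEV="${DEV:-sdh}"
SCHED_FILE="/sys/block/$DEV/queue/scheduler"

# Only touch sysfs when explicitly asked to, and only if the file exists.
if [ "${1:-}" = "apply" ] && [ -r "$SCHED_FILE" ]; then
    current=$(active_scheduler "$(cat "$SCHED_FILE")")
    echo "$DEV: active scheduler is ${current:-unknown}"
    if [ "$current" != "deadline" ]; then
        # Needs root; the setting does not survive a reboot.
        echo deadline > "$SCHED_FILE"
    fi
fi
```

Run with no arguments it does nothing; run as root with "apply" it prints the current scheduler and switches to deadline if something else is active. The write_expire tweak is left out on purpose, since whether it applies depends on your interval and tko settings.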