The typical NFS cluster setups seem to fail for Gigabit NFS/tcp. Clients that are busy during a service relocation either bail out with RPC garbage errors, get EACCES on the filesystem, or time out for 17 minutes. This comes down to a racing/timing issue in the ordering of the NFS export vs. IP address setup/teardown. Protecting the service startup/shutdown with an iptables rule that drops NFS traffic while the move is in progress is a good workaround.

But what is the proper way to integrate this workaround? The options I see:

- Set up new resource agents, one with start=1 and another with start=6, to start/stop dropping packets around the rest of the service (sketched below).
- Modify the current resource agents to allow for child entities, and wrap one script around the service and one in the inner element.
- Hack ip.sh to introduce some delay, to make sure the NFS services are really up/down before proceeding.
- Or fix the true evil by making nfsexport.sh wait for NFS startup/stop completion (how?).

What's the best way?
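Here is roughly what I have in mind for the first option. Untested sketch; the script name, the service IP, and the port list are just placeholders for whatever the real setup uses:

#!/bin/sh
# nfs-fence.sh -- hypothetical helper, not a real rgmanager agent yet.
# One resource instance would call "block" early (start=1) and another
# "unblock" late (start=6), reversed on stop, so busy clients only see
# dropped packets (and retransmit) instead of RPC garbage or EACCES
# while the exports and the service IP are in flux.

SERVICE_IP=192.168.1.100   # the floating cluster IP (example value)

case "$1" in
block)
    # Drop inbound NFS (2049) and portmapper (111) traffic, tcp and udp.
    # mountd/statd/lockd sit on dynamic ports and may need rules too.
    for p in tcp udp; do
        iptables -I INPUT -d "$SERVICE_IP" -p "$p" --dport 2049 -j DROP
        iptables -I INPUT -d "$SERVICE_IP" -p "$p" --dport 111 -j DROP
    done
    ;;
unblock)
    for p in tcp udp; do
        iptables -D INPUT -d "$SERVICE_IP" -p "$p" --dport 2049 -j DROP
        iptables -D INPUT -d "$SERVICE_IP" -p "$p" --dport 111 -j DROP
    done
    ;;
esac

The point of DROP rather than REJECT is that tcp clients simply retransmit until the rules are removed, so nothing errors out in between.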
-- Axel.Thimm at ATrpms.net