Hi

Axel Thimm wrote:

>The typical NFS cluster setups seem to fail for Gigabit NFS/tcp. Some
>clients that are busy during the relocation of services either bail
>out with RPC garbage, or set the filesystem to EACCES, or time out for
>17 min.
>
>

We observe this problem too, using NFS over TCP.
Mounting the filesystem with
-o tcp,timeo=600,retrans=1
reduces the timeout to about one minute on Linux and Solaris 10.

Greetings

Hansjörg

>This has to do with some racing/timing in the NFS vs ip setup/teardown
>procedure. Protecting the service startup/shutdown with an iptables
>rule is a good workaround to fix this.
>
>But what is the proper way to integrate this workaround? I could set up
>new resource agents, one with start=1 and another with start=6 to
>start/stop dropping packets. Or I could modify the current resource
>agents to allow for child entities and wrap one script around the
>service and one in the inner element.
>
>I could probably also hack ip.sh to introduce some delay, to make sure
>the NFS services are really up/down before proceeding. Or maybe fix
>the true evil by making nfsexport.sh wait for NFS startup/stop
>completion (how?)?
>
>What's the best way?
>
>
>------------------------------------------------------------------------
>
>--
>
>Linux-cluster@xxxxxxxxxx
>http://www.redhat.com/mailman/listinfo/linux-cluster
>

--
_________________________________________________________________

Dr. Hansjoerg Maurer           | LAN- & System-Manager
                               |
Deutsches Zentrum              | DLR Oberpfaffenhofen
  f. Luft- und Raumfahrt e.V.  |
Institut f. Robotik            |
Postfach 1116                  | Muenchner Strasse 20
82230 Wessling                 | 82234 Wessling
Germany                        |
                               |
Tel: 08153/28-2431             | E-mail: Hansjoerg.Maurer@xxxxxx
Fax: 08153/28-1134             | WWW: http://www.robotic.dlr.de/
__________________________________________________________________

There are 10 types of people in this world,
those who understand binary and those who don't.
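
For what it's worth, the iptables workaround mentioned in the quoted text could be sketched roughly as below. This is only an illustration, not what the resource agents actually do: the helper names (drop_rule, packets_dropped), the choice of the INPUT chain, port 2049 and the example address are all assumptions on my part. The idea is to drop TCP packets addressed to the service IP while the service is being started or stopped, and to remove the rule again afterwards, even if the relocation step fails.

```python
import subprocess
from contextlib import contextmanager

NFS_PORT = 2049  # assumption: NFS over TCP on the usual port


def drop_rule(action, service_ip, port=NFS_PORT):
    """Build an iptables command that inserts ("-I") or deletes ("-D")
    a DROP rule for TCP packets addressed to the service IP."""
    return ["iptables", action, "INPUT", "-d", service_ip,
            "-p", "tcp", "--dport", str(port), "-j", "DROP"]


@contextmanager
def packets_dropped(service_ip, port=NFS_PORT, run=None):
    """Drop packets to the service IP for the duration of the block.

    `run` is a hypothetical hook so the commands can be inspected
    without root; by default they are executed via subprocess.
    """
    run = run or (lambda cmd: subprocess.run(cmd, check=True))
    run(drop_rule("-I", service_ip, port))
    try:
        yield
    finally:
        # Always remove the rule, even if start/stop raised an error.
        run(drop_rule("-D", service_ip, port))
```

It would be used around the start or stop of the NFS export and the IP address, for example `with packets_dropped("192.0.2.10"): relocate_service()`, where relocate_service stands for whatever the cluster does at that point. Running the real commands needs root; passing `run=print` only shows what would be executed.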