On Thu, Aug 18 2016, Chuck Lever wrote:

>> On Aug 17, 2016, at 9:32 PM, NeilBrown <neilb@xxxxxxxx> wrote:
>>
>> On Wed, Aug 17 2016, J. Bruce Fields wrote:
>>>>
>>>>
>>>> There is another issue related to this that I've been meaning to
>>>> mention.  It relates to the start-up ordering rather than shut down.
>>>>
>>>> When you try to mount an NFS filesystem and the server isn't responding,
>>>> mount.nfs retries for a little while and then - if "bg" is given - it
>>>> forks and retries a bit longer.
>>>> While it keeps getting failures that appear temporary, like ECONNREFUSED or
>>>> ETIMEDOUT (see nfs_is_permanent_error()) it keeps retrying.
>>>>
>>>> There is typically a window between when rpcbind starts responding to
>>>> queries, and when nfsd has registered with it.  If mount.nfs sends an
>>>> rpcbind query in this window, it gets RPC_PROGNOTREGISTERED which
>>>> nfs_rewrite_pmap_mount_options maps to EOPNOTSUPP, and
>>>> nfs_is_permanent_error() thinks that is a permanent error.
>>>
>>> Looking at rpcbind(8)....  Shouldn't "-w" prevent this by loading some
>>> registrations before it starts responding to requests?
>>
>> "-w" (which isn't listed in the SYNOPSIS!) only applies to a warm-start
>> where the daemons which previously registered are still running.
>> The problem case is that the daemons haven't registered yet (so we don't
>> necessarily know what port number they will get).
>>
>> To address the issue in rpcbind, we would need a flag to say "don't
>> respond to lookup requests, just accept registrations", then when all
>> registrations are complete, send some message to rpcbind to say "OK,
>> respond to lookups now".  That could even be done by killing and
>> restarting with "-w", though that is a bit ugly.
>
> An alternative would be to create a temporary firewall rule that
> blocked port 111 to remote connections. Once local RPC services
> had registered, the rule is removed.
>
> Just a thought.

Interesting ..... probably the sort of thing that I would resort to if I
really needed to fix this and didn't have any source code.
But fiddling with firewall rules is not one of my favourite things, so I
think I'll stick with a more focussed solution.

Thanks!
NeilBrown

>
>
>> I'm leaning towards having mount retry after RPC_PROGNOTREGISTERED for
>> fg like it does with bg.
>
> It probably should do that. If rpcbind is up, then the other
> services are probably on their way.
>
>
> --
> Chuck Lever
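
For illustration only, the retry decision discussed in this thread could be sketched roughly as below. This is a simplified model under stated assumptions, not the nfs-utils source: every identifier ending in `_sketch` is hypothetical, and only the errno values named in the thread (ECONNREFUSED, ETIMEDOUT, EOPNOTSUPP) are modelled. It shows how a "program not registered" reply ends up classified as permanent once it has been mapped to EOPNOTSUPP, and how a retry check that looks at the rpcbind result first would treat it as temporary.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rpcbind query outcome; not a real RPC type. */
enum rpcb_result_sketch {
	RPCB_OK_SKETCH,
	RPCB_PROG_NOT_REGISTERED_SKETCH,
};

/* Models the mapping described in the thread:
 * "program not registered" becomes EOPNOTSUPP. */
static int map_rpcb_result_sketch(enum rpcb_result_sketch r)
{
	return r == RPCB_PROG_NOT_REGISTERED_SKETCH ? EOPNOTSUPP : 0;
}

/* Models the classification described in the thread: only errors that
 * look temporary are retried, everything else counts as permanent.
 * The real function may list more cases; this is an assumption. */
static bool is_permanent_error_sketch(int error)
{
	switch (error) {
	case ECONNREFUSED:
	case ETIMEDOUT:
		return false;	/* looks temporary: keep retrying */
	default:
		return true;	/* includes EOPNOTSUPP */
	}
}

/* One possible shape of the proposed change: if rpcbind answered but the
 * program is not registered yet, assume the service is still starting
 * and retry, for fg as well as bg mounts. */
static bool should_retry_sketch(enum rpcb_result_sketch r, int error)
{
	if (r == RPCB_PROG_NOT_REGISTERED_SKETCH)
		return true;
	return !is_permanent_error_sketch(error);
}
```

With this shape, the start-up window between rpcbind answering and nfsd registering no longer aborts the mount, while an EOPNOTSUPP that arises for any other reason is still treated as permanent.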