On 11/22/2016 05:43 PM, NeilBrown wrote:
> On Wed, Nov 23 2016, Steve Dickson wrote:
>
>> [Resent due to mailman rejecting the HTML subpart]
> (and the resend included HTML too ... how embarrassing :-)
Yeah... :-) I guess an upgrade turned it on.

>
>>
>> Hey Neil,
>>
>>
>> On 08/18/2016 09:45 PM, NeilBrown wrote:
>>> Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")
>>>
>>> changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
>>> which maps to EOPNOTSUPP, is not a permanent error.
>>> This is useful because when an NFS server starts up there is a small window between
>>> the moment that rpcbind (or portmap) starts responding to lookup requests,
>>> and the moment when nfsd registers with rpcbind. During that window
>>> rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.
>>>
>>> This same reasoning applies to foreground mounts. They don't wait for
>>> as long, but could still hit the window and fail prematurely.
>>>
>>> So revert the above patch and instead add EOPNOTSUPP to the list of
>>> temporary errors known to nfs_is_permanent_error.
>>>
>>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>>> ---
>>>  utils/mount/stropts.c | 7 +++----
>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>>> index 9de6794c6177..d5dfb5e4a669 100644
>>> --- a/utils/mount/stropts.c
>>> +++ b/utils/mount/stropts.c
>>> @@ -948,6 +948,7 @@ static int nfs_is_permanent_error(int error)
>>>  	case ETIMEDOUT:
>>>  	case ECONNREFUSED:
>>>  	case EHOSTUNREACH:
>>> +	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
>> I think this introduced a regression... When the server does not support
>> a protocol, say UDP, this patch causes the mount to hang forever,
>> which I don't think we want.
>
>
> I think we do want it to wait a while so that the nfs server has a
> chance to start up. We have no guarantee that the NFS server will be
> registered with rpcbind before rpcbind responds to requests.
I do see this race, but it has to be a small window. With
Fedora it's under seconds between the time rpcbind starts
and the time the NFS server starts.

>
> I disagree with the "hang forever" description. I just tested after
> disabling UDP on an nfs server, and the delay was 2 minutes, 5 seconds
> before a failure was reported. It might be longer when trying TCP on a
> server that only supports UDP.
Yeah, I did not wait that long... You are a much more patient
man than I ;-)

I do think this is a regression. Going from an instant
failure to one that takes over 2 minutes is not a good thing... IMHO.

>
> So I think the current behavior is correct. You might be able to argue
> that certain error codes should trigger a shorter timeout, but it would
> need a strong argument.
Going with the theory that the window is very small, how about
a retry with a timeout, then a failure?

>
> Or maybe you mean that a "bg" mount would "hang forever" in the
> background? I think that behavior is correct too.
I agree... "bg" mounts should hang longer than fg mounts,
but they shouldn't hang for something that will never happen,
like the non-support of a protocol.

steved.
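
For illustration only, here is a minimal sketch of the "retry with a timeout, then a failure" idea discussed above. It is not code from nfs-utils and not a proposed patch. The names sketch_is_temporary_error and sketch_should_retry, the deadline parameter, and the 10-second SKETCH_UNREGISTERED_GRACE_SECS value are all hypothetical. The error list contains only the cases visible in the quoted hunk, so the real nfs_is_permanent_error may cover more.

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical value: how long a mount tolerates "program not
 * registered" before treating it as a real failure. */
#define SKETCH_UNREGISTERED_GRACE_SECS 10

/* Errors shown as temporary in the quoted hunk (assumed incomplete). */
static int sketch_is_temporary_error(int error)
{
	switch (error) {
	case ETIMEDOUT:
	case ECONNREFUSED:
	case EHOSTUNREACH:
	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
		return 1;
	default:
		return 0;
	}
}

/*
 * Decide whether to try the mount again.
 *   started:  when the first attempt was made
 *   now:      current time
 *   deadline: overall retry cutoff for temporary errors
 * EOPNOTSUPP gets only the short grace window, on the theory that
 * the rpcbind/nfsd startup race is brief, while an unsupported
 * protocol never goes away.
 */
static int sketch_should_retry(int error, time_t started, time_t now,
			       time_t deadline)
{
	assert(now >= started);

	if (!sketch_is_temporary_error(error))
		return 0;	/* permanent: fail immediately */
	if (now >= deadline)
		return 0;	/* overall retry budget exhausted */
	if (error == EOPNOTSUPP &&
	    now - started >= SKETCH_UNREGISTERED_GRACE_SECS)
		return 0;	/* startup window has surely passed */
	return 1;
}
```

With these assumed values, an EOPNOTSUPP reply 5 seconds after the first attempt is retried, while one at 10 seconds or later fails. ETIMEDOUT and the other temporary errors keep retrying until the overall deadline.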