Re: [PATCH 1/2] mount: ECONNREFUSED is a permanent error

On Oct 9, 2009, at 11:20 AM, Steve Dickson wrote:
On 10/09/2009 11:13 AM, Chuck Lever wrote:
On Oct 9, 2009, at 9:16 AM, Steve Dickson wrote:
On 10/08/2009 01:37 PM, Chuck Lever wrote:
I had assumed early on that mount.nfs should retry a refused connection.

Apparently this is not the case. Legacy mount.nfs4 fails immediately
if the NFS server refuses the connection.  Legacy mount.nfs and
text-based mount.nfs both fail immediately if the rpcbind service is
refusing connections.

What about if the server is on the way up (i.e., the network is up)
but has not started the NFS service? In that window, the server will
return ECONNREFUSED since nobody is listening, but in a very short time there will be a listener... The mount should not fail in that case...

I agree, but I think it does fail today, and it has behaved this way for a long while. No one has complained about it. I'm actually not arguing in favor of either behavior; just reporting that the current behavior is
inconsistent.

With the current code, legacy and text-based v2/v3 fail immediately if
the server's rpcbind refuses the connection... Legacy mount.nfs4 fails
immediately if the NFS server refuses the connection. Text-based mount.nfs4
retries in this case.
I think the text-based mounts have it right...
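
For reference, the four cases described above can be laid out as a small lookup table. This is only an illustrative sketch in Python: the dictionary name, the key labels, and the helper function are invented here and are not identifiers from nfs-utils. Only the behaviors themselves come from the description in this thread.

```python
# Behavior on a refused connection, as described in this thread.
# Key: (mount style, NFS version). Value: (service that refuses, action).
# All names here are illustrative, not taken from nfs-utils.
BEHAVIOR_ON_REFUSED = {
    ("legacy", "v2/v3"): ("rpcbind", "fail"),
    ("text-based", "v2/v3"): ("rpcbind", "fail"),
    ("legacy", "v4"): ("nfs server", "fail"),
    ("text-based", "v4"): ("nfs server", "retry"),
}


def inconsistent_cases(table):
    """Return the cases whose action differs from the most common action."""
    actions = [action for _, action in table.values()]
    majority = max(set(actions), key=actions.count)
    return sorted(key for key, (_, action) in table.items()
                  if action != majority)
```

Calling inconsistent_cases(BEHAVIOR_ON_REFUSED) singles out the text-based v4 case, which is the one that retries.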

It's a change from legacy behavior, however, so we should test carefully. The trade-off is that the mount.nfs command is less responsive because it's retrying a connection refusal, but it's more likely that the mount request will succeed.

Again, I'm not advocating for one or the other, just pointing out the compromises.

So we will either need to fix v2/v3 to continue retrying, or fix NFSv4 to stop retrying. The retries would stop after mount.nfs's retry timer expires (just like the case where the server isn't responding at all).
The former, IMHO. I also notice that the retry timer does not work, since
the mount waits in the kernel well past the timer expiring...
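
The "keep retrying until the retry timer expires" option could look roughly like the loop below. This is a minimal Python sketch under stated assumptions, not the mount.nfs implementation: the function name, the injected try_mount/now/sleep callables, the backoff parameter, and the choice of which errno values count as transient are all hypothetical.

```python
import errno

# Errors treated as transient in this sketch. This set is an assumption
# made for illustration, not the list mount.nfs actually uses.
TRANSIENT = {errno.ECONNREFUSED, errno.ETIMEDOUT}


def mount_with_retry(try_mount, retry_secs, now, sleep, backoff=1.0):
    """Call try_mount() until it succeeds or retry_secs have elapsed.

    try_mount returns 0 on success or an errno value on failure.
    now and sleep are passed in so the loop can be exercised with a
    fake clock. Returns 0 or the last errno seen.
    """
    deadline = now() + retry_secs
    while True:
        err = try_mount()
        if err == 0 or err not in TRANSIENT:
            return err          # success, or a permanent error
        if now() >= deadline:
            return err          # retry timer expired
        sleep(backoff)
```

Note that the deadline is only checked after try_mount() returns, so an attempt that blocks for a long time delays the give-up point by the same amount.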

It does work, after a fashion, but yes, it's less responsive than it was before. For background mounts it hardly matters because bg mounts retry for a good long while. The case where it gets a little ugly is fg, when mount.nfs's retry timer is nearly always shorter than the kernel's connect retry timeout.
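
The fg case can be illustrated with a little arithmetic. The Python sketch below uses a deliberately simplified model in which every attempt blocks in the kernel for a fixed time and the retry timer is consulted only when an attempt returns. The function name is hypothetical, and the kernel wait figures in the usage note are made-up numbers, not measured or documented values.

```python
def fg_mount_elapsed(retry_secs, kernel_block_secs):
    """Seconds until a foreground mount gives up in this simplified model.

    Each attempt blocks for kernel_block_secs, and the retry timer is
    checked only after an attempt returns, so the result is always a
    whole number of kernel waits.
    """
    if kernel_block_secs <= 0:
        raise ValueError("kernel_block_secs must be positive")
    elapsed = 0
    while True:
        elapsed += kernel_block_secs    # attempt blocks in the kernel
        if elapsed >= retry_secs:       # timer checked only on return
            return elapsed
```

With a 120-second retry timer and a hypothetical 180-second kernel wait, fg_mount_elapsed(120, 180) gives 180: the command returns a full minute after its own timer expired. Shortening the kernel wait to a hypothetical 50 seconds gives 150, which is closer to the requested limit.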

I've got some kernel level fixes for this... see the SOFTCONN patches from earlier this week. Shortening the initial connect retry timeout in the kernel will also help the case where the server isn't responding at all.

Automounter might want different behavior in this case, but we should
ask around before making a final decision, probably.
Ian... What do you think??

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



