Re: mount.nfs4 hangs when rpcbind is not reachable

Chuck Lever <chuck.lever@xxxxxxxxxx> · Fri, 23 Apr 2010 13:00:14 -0400

On 04/23/2010 12:25 PM, Jan Engelhardt wrote:

On Friday 2010-04-23 18:03, Chuck Lever wrote:

Don't ask me. When the kernel has started, lo is in the down state, and
does not have any addresses assigned either. Distros currently have to
do that themselves, usually only after the root filesystem has been
mounted. I just ran into and reported that issue where lo is down the
entire initramfs time. Needless to say NFSv3 has no problems with lo
being down.

... that we know of.  I don't think statd and lockd would work in this case,
but I've never tried it.

Well yeah, to use NFS as a root, -o nolock is commonly used.

NFSv4 is known not to work for NFSROOT (although you are using 
mount.nfs4 from an initramfs, not NFSROOT).  One problem is that 
idmapper has to be running to prevent NFSv4 deadlocks.

I'm just a little surprised because I was not aware that anyone was 
doing user space NFS mounts in an environment with no lo configured.

If you have an initramfs mounted as root, the ramfs's init scripts 
probably could get lo going before doing the mount, in this case.

NFS has never worked in this case, because there would be no way for
the kernel to communicate with user space.

Netlink and ioctls work without lo ;-)

Sure, but RPC doesn't go over ioctls :-)

Well maybe it should [go over netlink].

I'm actually planning to construct an RPC over AF_UNIX transport 
capability for the kernel.  This will mirror support for RPC over 
AF_UNIX added in user space with the introduction of libtirpc.  rpcbind 
already has an AF_UNIX listener thanks to libtirpc.

However, this work was planned for a time when lo is replaced with lo6 
in a large number of cases, which should be some time in the future. 
Your report is accelerating this use case!  :-)

In fact, you'd be surprised how much of Linux works without an enabled
lo device. Part of it may be because eth0 is up and has an address that
can be used to do loopbacking ('local 192.168.1.15 dev eth0 proto
kernel scope host src 192.168.1.15' in `ip route list table local`).

So, one way to address this would be if kernel_connect() returns a distinctive
errno in this case (I would expect something like ENETDOWN) and then have the
RPC transport behave as if it had received ECONNREFUSED.
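
The idea above can be sketched in user space. This is only an illustration in Python, not kernel code: the function name `normalize_connect_error` is hypothetical, and ENETDOWN is just the guess made above about what kernel_connect() might return when lo is down.

```python
import errno

# Errors that would be treated like a refused connection. ENETDOWN is the
# guess from the discussion; what kernel_connect() really returns in this
# situation has not been confirmed.
TREAT_AS_REFUSED = frozenset({errno.ENETDOWN})


def normalize_connect_error(err: int) -> int:
    """Map an 'interface is down' errno onto ECONNREFUSED; pass others through.

    Accepts either a positive errno or a kernel-style negative value and
    always returns the positive errno.
    """
    code = abs(err)
    if code in TREAT_AS_REFUSED:
        return errno.ECONNREFUSED
    return code
```

With this mapping, a caller that already handles ECONNREFUSED would need no new code path for the lo-is-down case, while a timeout such as ETIMEDOUT would still be reported unchanged.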

Are you in a position to enable RPC debugging before doing that mount? If so,
you can do

  # rpcdebug -m rpc -s trans

xs_error_report client f67bb800...
error 110
xs_tcp_state_change client f67bb800...
state 7 conn 0 dead 0 zapped 1
xs_tcp_send_request(44) = -118
sendmsg returned unrecognized error 110
xs_tcp_state_change client ..
[...]
disconnecting xprt f67bb800 to reuse port
[...]
worker connecting xprt f67bb800 via tcp to 127.0.0.1 (port 111)
f67bb800 connect status 115 connected 0 sock state 2
xs_tcp_send_request(88) = -11
3 xmit incomplete (88 left of 88)

and so on (repeats every 20 sec)

I'd like to see the full log captured during your test, with time 
stamps.  110 is ETIMEDOUT, which suggests the network layer is not 
reporting that the loopback interface is not up, but simply that the SYN 
is timing out.
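
For reading traces like the one above, a small decoder for the numeric codes can help. This is a minimal sketch: the table holds only a few Linux errno numbers, the regular expression is an assumption about the line shapes seen in this excerpt, and `decode_line` is a hypothetical helper name.

```python
import re

# A few Linux errno numbers relevant to the trace (generic Linux values).
LINUX_ERRNO = {
    11: "EAGAIN",
    100: "ENETDOWN",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
    115: "EINPROGRESS",
}

# Picks up the number that follows "error", "status", or "=" in a line.
CODE_RE = re.compile(r"(?:error|status|=)\s*(-?\d+)")


def decode_line(line: str) -> list:
    """Return symbolic names for the status codes found in one trace line."""
    names = []
    for match in CODE_RE.finditer(line):
        code = abs(int(match.group(1)))  # the trace mixes signed and unsigned
        names.append(LINUX_ERRNO.get(code, f"errno {code}"))
    return names


if __name__ == "__main__":
    for sample in (
        "sendmsg returned unrecognized error 110",
        "f67bb800 connect status 115 connected 0 sock state 2",
        "xs_tcp_send_request(88) = -11",
    ):
        print(sample, "->", decode_line(sample))
```

Codes that are not in the table are left numeric rather than guessed.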

And if you could, "^-s trans^-s trans xprt clnt sched bind".
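
The quoted string is shell quick-substitution shorthand (`^old^new`) applied to the earlier rpcdebug command. A minimal sketch of what it expands to, assuming the previous command was exactly the one shown above; `quick_substitute` is a hypothetical helper, not part of any shell:

```python
def quick_substitute(previous: str, expr: str) -> str:
    """Apply a ^old^new quick substitution to the previous command line."""
    if not expr.startswith("^"):
        raise ValueError("expected ^old^new")
    parts = expr[1:].split("^")
    if len(parts) < 2 or not parts[0]:
        raise ValueError("expected ^old^new")
    old, new = parts[0], parts[1]
    if old not in previous:
        raise ValueError("substitution failed: pattern not found")
    # As in the shell, only the first occurrence is replaced.
    return previous.replace(old, new, 1)


if __name__ == "__main__":
    print(quick_substitute(
        "rpcdebug -m rpc -s trans",
        "^-s trans^-s trans xprt clnt sched bind",
    ))
```

In other words, the request is to rerun the same command with the extra debug flags appended after `trans`.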

Thanks for your help.

--
chuck[dot]lever[at]oracle[dot]com