On 04/23/2010 12:25 PM, Jan Engelhardt wrote:
On Friday 2010-04-23 18:03, Chuck Lever wrote:
Don't ask me. When the kernel has started, lo is in the down state, and
does not have any addresses assigned either. Distros have to currently
do that themselves - usually only after the root filesystem has been
mounted. I just ran into and reported that issue where lo is down the
entire initramfs time. Needless to say NFSv3 has no problems with lo
being down.
... that we know of. I don't think statd and lockd would work in this case,
but I've never tried it.
Well yeah, to use NFS as a root, -o nolock is commonly used.
NFSv4 is known not to work for NFSROOT (although you are using
mount.nfs4 from an initramfs, not NFSROOT). One problem is that
idmapper has to be running to prevent NFSv4 deadlocks.
I'm just a little surprised because I was not aware that anyone was
doing user space NFS mounts in an environment with no lo configured.
If you have an initramfs mounted as root, the ramfs's init scripts
probably could get lo going before doing the mount, in this case.
NFS has never worked in this case, because there would be no way for
the kernel to communicate with user space.
Netlink and ioctls work without lo ;-)
Sure, but RPC doesn't go over ioctls :-)
Well maybe it should [go over netlink].
I'm actually planning to construct an RPC over AF_UNIX transport
capability for the kernel. This will mirror support for RPC over
AF_UNIX added in user space with the introduction of libtirpc. rpcbind
already has an AF_UNIX listener thanks to libtirpc.
However, this work was planned for a time when lo is replaced with lo6
in a large number of cases, which should be some time in the future.
Your report is accelerating this use case! :-)
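
For illustration only, here is a rough user-space sketch of what carrying an RPC record over an AF_UNIX stream socket looks like. It is not the planned kernel transport and not libtirpc code. It assumes the standard RPC record marking for stream transports (a 4-byte big-endian header whose high bit flags the last fragment, as in RFC 5531). The helper names and the payload are made up for the example.

```python
import socket
import struct

LAST_FRAGMENT = 0x80000000  # high bit of the 4-byte record mark


def send_record(sock, payload):
    """Send payload as a single, final RPC record fragment."""
    sock.sendall(struct.pack(">I", LAST_FRAGMENT | len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-record")
        buf += chunk
    return buf


def recv_record(sock):
    """Reassemble fragments until one carries the last-fragment bit."""
    data = b""
    while True:
        (mark,) = struct.unpack(">I", recv_exact(sock, 4))
        data += recv_exact(sock, mark & 0x7FFFFFFF)
        if mark & LAST_FRAGMENT:
            return data


def demo():
    # A connected AF_UNIX pair: no lo (or any network interface) involved.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        send_record(a, b"hypothetical RPC call body")
        return recv_record(b)
    finally:
        a.close()
        b.close()
```

The point of the sketch is only that the byte stream between the two endpoints never touches the IP stack, so the state of lo does not matter.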
In fact, you'd be surprised how much of Linux works without an enabled
lo device. Part of it may be because eth0 is up and has an address that
can be used to do loopbacking ('local 192.168.1.15 dev eth0 proto
kernel scope host src 192.168.1.15' in `ip route list table local`).
So, one way to address this would be if kernel_connect() returns a distinctive
errno in this case (I would expect something like ENETDOWN) and then have the
RPC transport behave as if it had received ECONNREFUSED.
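
Roughly, the idea above looks like the following user-space model. This is not kernel code. ENETDOWN is only my guess at what the connect call would return, and normalize_connect_error is a hypothetical name used for this sketch.

```python
import errno


def normalize_connect_error(err):
    """Map a 'network is down' connect failure onto 'connection refused'.

    Assumption: the connect path reports ENETDOWN when lo is down.
    Any other error code is passed through unchanged.
    """
    if err == errno.ENETDOWN:
        return errno.ECONNREFUSED
    return err
```

With a mapping like this, the caller's existing ECONNREFUSED handling would apply right away rather than after a connect timeout.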
Are you in a position to enable RPC debugging before doing that mount? If so,
you can do
# rpcdebug -m rpc -s trans
xs_error_report client f67bb800...
error 110
xs_tcp_state change client f67bb800...
state 7 conn 0 dead 0 zapped 1
xs_tcp_send_request(44) = -118
sendmsg returned unrecognized error 110
xs_tcp_state_change client ..
[...]
disconnecting xprt f67bb800 to reuse port
[...]
worker connecting xprt f67bb800 via tcp to 127.0.0.1 (port 111)
f67bb800 connect status 115 connected 0 sock state 2
xs_tcp_send_request(88) = -11
3 xmit incomplete (88 left of 88)
and so on (repeats every 20 sec)
I'd like to see the full log captured during your test, with time
stamps. 110 is ETIMEDOUT, which suggests the network layer is not
reporting that the loopback interface is not up, but simply that the SYN
is timing out.
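
As a reading aid for the numbers in the log, here is a small sketch that decodes some of them. The table is hand-written and deliberately partial. It uses the generic Linux errno values (Python's own errno module would reflect whatever platform the script runs on), and decode is a hypothetical helper name.

```python
# Partial table of generic Linux errno values; codes not listed are left undecoded.
LINUX_ERRNO = {
    11: "EAGAIN",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
    115: "EINPROGRESS",
}


def decode(code):
    """Return the symbolic name for a (possibly negated) errno value."""
    return LINUX_ERRNO.get(abs(code), "not in this table")
```

For example, decode(110) gives "ETIMEDOUT", decode(115) gives "EINPROGRESS", and decode(-11) gives "EAGAIN".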
And if you could, "^-s trans^-s trans xprt clnt sched bind".
Thanks for your help.
--
chuck[dot]lever[at]oracle[dot]com