On Jun 11, 2009, at 12:48 AM, Neil Brown wrote:
On Thursday May 28, chuck.lever@xxxxxxxxxx wrote:
On May 28, 2009, at 2:33 AM, NeilBrown wrote:
[An alternate might be to make the sunrpc code always "connect"
udp sockets so that "port not reachable" errors would get reported
back. This requires a more intrusive change though and might have
other consequences]
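
For readers following along, here is a minimal user-space sketch of the behaviour being described. It is not sunrpc code; the helper names (`find_closed_udp_port`, `probe_udp_port`) are made up for illustration, and it assumes a Linux-like stack where an ICMP "port unreachable" is reported as ECONNREFUSED only on a connected UDP socket.

```python
import socket


def find_closed_udp_port(host="127.0.0.1"):
    """Return a UDP port that was just released, so nothing should be listening."""
    tmp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tmp.bind((host, 0))          # let the kernel pick a free port
    port = tmp.getsockname()[1]
    tmp.close()                  # the port is now closed again
    return port


def probe_udp_port(host, port, connected, timeout=0.5):
    """Send one datagram and report 'reply', 'refused' or 'timeout'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        if connected:
            # connect() on UDP sends nothing; it only records the peer so
            # the kernel can match ICMP errors back to this socket.
            sock.connect((host, port))
            sock.send(b"ping")
        else:
            sock.sendto(b"ping", (host, port))
        sock.recvfrom(1024)
        return "reply"
    except ConnectionRefusedError:
        return "refused"         # "port not reachable" was reported back
    except socket.timeout:
        return "timeout"         # nothing came back; caller must wait it out
    finally:
        sock.close()


if __name__ == "__main__":
    port = find_closed_udp_port()
    print("unconnected:", probe_udp_port("127.0.0.1", port, connected=False))
    print("connected:  ", probe_udp_port("127.0.0.1", port, connected=True))
```

On a typical Linux host the unconnected probe waits for the full timeout, while the connected probe should fail quickly with "refused"; other stacks may differ.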
We had discussed this about a year ago when I started adding IPv6
support. I had suggested switching the local rpc client to use TCP
instead of UDP to solve exactly this time-out problem during start-up.
There was some resistance to the idea because TCP would leave
privileged ports in TIME_WAIT (at shutdown, this is probably not a
significant concern).
Trond had intended to introduce connected UDP socket support to the
RPC client, although we were also interested in someday having a
single UDP socket for all RPC traffic... the design never moved on
from there.
My feeling at this point is that having a connected UDP socket
transport would be simpler and have broader benefits than waiting for
an eventual design that can accommodate multiple transport instances
sharing a single socket.
The use of connected UDP would have to be limited to known-safe cases
such as contacting the local portmap. I believe there are still NFS
servers out there that - if multihomed - can reply from a different
address to the one the request was sent to.
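
The multihomed-server concern can be reproduced in user space with a rough sketch like the one below. It is an illustration only: the names `run_misaddressed_server` and `exchange` are hypothetical, and a second socket on a different loopback port stands in for a server that replies from a different interface address. The assumption is that a connected UDP socket only accepts datagrams whose source matches the connected peer.

```python
import socket
import threading


def run_misaddressed_server(listen_sock, reply_sock):
    """Receive one request on listen_sock but answer from reply_sock."""
    data, client = listen_sock.recvfrom(1024)
    reply_sock.sendto(b"reply:" + data, client)


def exchange(connected, timeout=0.5):
    """Send one request; return the reply bytes, or None if none was accepted."""
    listen_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    reply_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listen_sock.bind(("127.0.0.1", 0))
    reply_sock.bind(("127.0.0.1", 0))   # different source than listen_sock
    server_addr = listen_sock.getsockname()

    server = threading.Thread(
        target=run_misaddressed_server, args=(listen_sock, reply_sock)
    )
    server.start()
    client.settimeout(timeout)
    try:
        if connected:
            client.connect(server_addr)  # only server_addr may answer now
            client.send(b"ping")
        else:
            client.sendto(b"ping", server_addr)
        try:
            data, _ = client.recvfrom(1024)
            return data
        except (socket.timeout, ConnectionRefusedError):
            return None                  # the reply was filtered out
    finally:
        server.join(timeout=2)
        client.close()
        listen_sock.close()
        reply_sock.close()


if __name__ == "__main__":
    print("unconnected client got:", exchange(connected=False))
    print("connected client got:  ", exchange(connected=True))
```

The unconnected client accepts the reply from the "wrong" source, while the connected client never sees it and times out, which is why connected UDP would need to be limited to peers known to answer from the address they were contacted on.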
I think I advocated for adding an entirely new transport capability
called CUDP at the time. But this is definitely something to remember
as we test.
If a new transport capability is added, we would likely need some
additional code in the NFS mount option parsing to expose such a
transport to user space. So, leaving that parsing code alone should
insulate the NFS client from the new transport until we have more
confidence.
And we would need to check that rpcbind does the right thing. I
recently discovered that rpcbind is buggy and will sometimes respond
from the wrong interface - I suspect localhost addresses are safe, but
we would need to check, or fix it (I fixed that bug in portmap (glibc
actually) 6 years ago and now it appears again in rpcbind - groan!).
Details welcome. We will probably need to fix libtirpc.
How hard would it be to add (optional) connected UDP support? Would
we just make the code more like the TCP version, or are there any
gotchas that you know of that we would need to be careful of?
The code in net/sunrpc/xprtsock.c is a bunch of transport methods,
many of which are shared between the UDP and TCP transport
capabilities. You could probably do this easily by creating a new
xprt_class structure and a new ops vector, then reuse as many UDP
methods as possible. The TCP connect method could be usable as is,
but it would be simple to copy-n-paste a new one if some variation is
required. Then, define a new XPRT_ value, and use that in
rpcb_create_local().
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com