On May 28, 2009, at 2:33 AM, NeilBrown wrote:
Unregistering an RPC service is not essential - but it is tidy.
So it is unpleasant to wait a long time for it to complete.
As unregistering is always directed to localhost, the most likely
reason for any delay is that portmap (or rpcbind) is not running.
In that case, waiting is totally pointless. In any case, we should
expect a quick response, and zero packet loss.
So reduce the timeouts to a total of half a second.
A much better solution here would be to use a connected UDP socket.
This would make start-up in the absence of rpcbind/portmap fail much
quicker, we could keep the longer, safer timeout value, and the kernel
could warn about exactly what the problem is. It would also have
similar benefits for the kernel's mount and NSM clients, and when
starting up services. It would additionally make UDP RPC ping
checking fail very quickly.
This means that if we try to stop nfsd while portmap is not running,
it takes 3 seconds instead of 3.5 minutes.
Note that stopping portmap before the last nfsd thread has died is
not simply a configuration error. When we kill nfsd threads they
could take a while to die, and it is possible that portmap could
then be killed before the last nfsd thread has had a chance to finish
up.
[An alternative might be to make the sunrpc code always "connect"
udp sockets so that "port not reachable" errors would get reported
back. This requires a more intrusive change though and might have
other consequences]
We had discussed this about a year ago when I started adding IPv6
support. I had suggested switching the local rpc client to use TCP
instead of UDP to solve exactly this time-out problem during start-up.
There was some resistance to the idea because TCP would leave
privileged ports in TIMEWAIT (at shutdown, this is probably not a
significant concern).
Trond had intended to introduce connected UDP socket support to the
RPC client, although we were also interested in someday having a
single UDP socket for all RPC traffic... the design never moved on
from there.
My feeling at this point is that having a connected UDP socket
transport would be simpler and have broader benefits than waiting for
an eventual design that can accommodate multiple transport instances
sharing a single socket.
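To illustrate what a connected UDP socket would buy us: once the socket
is connected, the ICMP port-unreachable that a dead portmapper triggers
comes back as ECONNREFUSED on the next send or receive, instead of the
request silently waiting out its timeout. A minimal user-space sketch
(not against any tree, just to show the mechanism):

/*
 * Sketch only: send a datagram to the portmapper port on localhost.
 * With nothing listening on port 111, the connected socket reports
 * ECONNREFUSED almost immediately; an unconnected socket would just
 * sit in recvfrom() until its timeout expired.
 */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
	struct sockaddr_in sin = {
		.sin_family = AF_INET,
		.sin_port   = htons(111),	/* rpcbind/portmap */
	};
	char buf[4];
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (fd < 0 || connect(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
		return 1;
	send(fd, "x", 1, 0);
	if (recv(fd, buf, sizeof(buf), 0) < 0)
		printf("recv: %s\n", strerror(errno));	/* Connection refused */
	close(fd);
	return 0;
}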
Signed-off-by: NeilBrown <neilb@xxxxxxx>
---
net/sunrpc/rpcb_clnt.c | 17 +++++++++++++++--
1 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index beee6da..31f7e2e 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -132,7 +132,8 @@ static const struct sockaddr_in rpcb_inaddr_loopback = {
};
static struct rpc_clnt *rpcb_create_local(struct sockaddr *addr,
- size_t addrlen, u32 version)
+ size_t addrlen, u32 version,
+ int unregister)
{
It might be cleaner to define the timeout structures as global
statics, and pass in a pointer to the preferred structure instead of
checking a passed-in integer. One conditional branch instead of two.
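Something like this, perhaps (untested, and going from memory on the
rpc_create_args field names that are not quoted in this patch):

static const struct rpc_timeout rpcb_unreg_timeout = {
	.to_initval	= HZ/2,
	.to_maxval	= HZ/2,
	.to_retries	= 0,
};

static struct rpc_clnt *rpcb_create_local(struct sockaddr *addr,
					  size_t addrlen, u32 version,
					  const struct rpc_timeout *timeout)
{
	struct rpc_create_args args = {
		.protocol	= XPRT_TRANSPORT_UDP,
		.address	= addr,
		.addrsize	= addrlen,
		.authflavor	= RPC_AUTH_UNIX,
		.flags		= RPC_CLNT_CREATE_NOPING,
		/* NULL keeps the transport's default timeout */
		.timeout	= timeout,
	};

	return rpc_create(&args);
}

The caller would then pass &rpcb_unreg_timeout (or NULL) directly.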
struct rpc_create_args args = {
.protocol = XPRT_TRANSPORT_UDP,
@@ -144,6 +145,16 @@ static struct rpc_clnt *rpcb_create_local(struct sockaddr *addr,
.authflavor = RPC_AUTH_UNIX,
.flags = RPC_CLNT_CREATE_NOPING,
};
+ const struct rpc_timeout unreg_timeout = {
+ .to_initval = HZ/2,
+ .to_maxval = HZ/2,
+ .to_increment = 0,
+ .to_retries = 0,
+ .to_exponential = 0,
+ };
+
+ if (unregister)
+ args.timeout = &unreg_timeout;
return rpc_create(&args);
}
@@ -183,10 +194,12 @@ static int rpcb_register_call(const u32 version, struct rpc_message *msg)
size_t addrlen = sizeof(rpcb_inaddr_loopback);
struct rpc_clnt *rpcb_clnt;
int result, error = 0;
+ int unregister;
msg->rpc_resp = &result;
- rpcb_clnt = rpcb_create_local(addr, addrlen, version);
+ unregister = (msg->rpc_proc->p_proc == RPCBPROC_UNSET);
+ rpcb_clnt = rpcb_create_local(addr, addrlen, version, unregister);
if (!IS_ERR(rpcb_clnt)) {
error = rpc_call_sync(rpcb_clnt, msg, 0);
rpc_shutdown_client(rpcb_clnt);
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com