Re: [PATCH] xs_bind retry binding forever

Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> · Fri, 22 Oct 2010 13:45:36 -0400

On Fri, 2010-10-22 at 11:56 -0400, Chuck Lever wrote:
> On Oct 21, 2010, at 3:38 PM, Trond Myklebust wrote:
> 
> > On Thu, 2010-10-21 at 13:33 -0500, Ben Myers wrote:
> >> Retry bind for reserved source ports forever.  Add an error message when we
> >> have a hard time binding one.
> > 
> > NACK. This approach leads to the process spinning forever in that loop,
> > which is exactly why we introduced the limit in the first place. See all
> > the old archived bug report emails about 'rpciod taking 100% cpu'.
> 
> The root problem seems to be the hard loop.  Thinking out loud, what
> if the client's FSM or some other higher up layer performed the retry,
> with a short delay inserted after each attempt?

The hard loop isn't the only problem. We return EADDRINUSE so that
mounts and/or automounts can fail quickly when we can't bind the
socket.

I suggest 2 changes:

     1. In case of error, pass the return value from xs_bind to the
        pending tasks
     2. Add a handler for EADDRINUSE in call_status(),
        xprt_connect_status() and call_connect_status(). Make sure that
        call_status() and call_connect_status() fail for SOFTCONN tasks,
        and that they print an error message, delay and retry in the
        case of ordinary hard tasks.
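
To illustrate the policy in point 2, here is a rough userspace sketch.
It is not the actual sunrpc code: the struct, the flag and the function
names are all hypothetical stand-ins, and the real handlers would do
the delay and the retry through the RPC state machine rather than hand
back an enum.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a soft-connect task flag. */
#define SKETCH_TASK_SOFTCONN 0x1

enum sketch_action {
	SKETCH_OTHER,		/* not EADDRINUSE: outside this sketch */
	SKETCH_FAIL,		/* exit the task with the bind error */
	SKETCH_DELAY_RETRY	/* caller should sleep, then reconnect */
};

/* Hypothetical, heavily reduced model of an RPC task. */
struct sketch_task {
	unsigned int flags;
	int status;		/* error passed up from the bind attempt */
	unsigned int retries;
};

/*
 * Decide what to do with the status handed up from the bind attempt.
 * Soft-connect tasks fail at once so mounts and automounts return
 * quickly; ordinary hard tasks log a message and ask the caller to
 * delay and retry. The delay itself is left to the caller.
 */
static enum sketch_action sketch_handle_bind_status(struct sketch_task *task)
{
	assert(task != NULL);

	if (task->status != -EADDRINUSE)
		return SKETCH_OTHER;

	if (task->flags & SKETCH_TASK_SOFTCONN)
		return SKETCH_FAIL;	/* status stays -EADDRINUSE */

	fprintf(stderr, "sketch: no free source port, retrying (attempt %u)\n",
		task->retries + 1);
	task->status = 0;
	task->retries++;
	return SKETCH_DELAY_RETRY;
}
```

A caller would pass each task that was woken with the bind error
through sketch_handle_bind_status(), exit it on SKETCH_FAIL, and on
SKETCH_DELAY_RETRY put it to sleep for a short interval before trying
the connect again. That keeps the quick failure for soft-connect tasks
without reintroducing the hard loop.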

Cheers
  Trond