Re: rpciod process is blocked in nfs_release_page waiting for nfs_commit_inode to complete

On Wed, 27 Jun 2012 19:37:03 +0000
"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:

> On Wed, 2012-06-27 at 15:28 -0400, Jeff Layton wrote:
> > On Wed, 27 Jun 2012 18:43:56 +0000
> > "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:
> > > The reason why we close these sockets is that if the attempt at aborting
> > > the connection fails, then they typically end up in a TIME_WAIT state.
> > > 
> > 
> > I'm still trying to wade through the xprtsock.c socket handling code,
> > but it looks like we currently tear down the connection in 3 different
> > ways:
> > 
> > xs_close: which basically calls sock_release and gets rid of our
> > reference to an existing socket. Most of the places where we disconnect
> > the socket use this. After this, we end up with srcport == 0 which
> > makes it pick a new port.
> > 
> > xs_tcp_shutdown: which calls ->shutdown on it, but doesn't free
> > anything. This also preserves the existing srcport.
> > 
> > xs_abort_connection: calls kernel_connect to reconnect the socket to
> > AF_UNSPEC address (effectively disconnecting it?). This also preserves
> > the srcport. I guess we use this just before reconnecting when the
> > remote end drops the connection, since we don't need to be graceful
> > about tearing anything down at that point.
> > 
> > The last one actually does reuse the same socket, so my thinking was
> > that we could extend that scheme to the other cases. If we called
> > ->shutdown on it and then reconnected it to AF_UNSPEC, would that
> > "reset" it back to a usable state?
> 
> Not that I'm aware of. The problem is that most of this stuff is
> undocumented. For instance, the AF_UNSPEC reconnect is documented only
> for UDP connections. While Linux implements it for TCP too, there is no
> spec (that I'm aware of) that explains how that should work.
> 

...and fwiw, it looks like reconnecting a TCP socket to an AF_UNSPEC
address doesn't work from userland -- you get back EINVAL. I have to
wonder if xs_abort_connection actually works as expected...
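
For anyone who wants to repeat that userland check, here is a minimal
sketch. It assumes a Linux box with a working loopback interface; the
helper name try_unspec_reconnect is made up for illustration and is not
taken from xprtsock.c. It connects a TCP socket to a throwaway local
listener, then calls connect() again with sa_family set to AF_UNSPEC and
reports what came back. The result may well differ between kernel
versions, so no expected value is claimed here.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Connect a TCP socket to a loopback listener, then "reconnect" it to an
 * AF_UNSPEC address. Returns 0 if the second connect() succeeds, the
 * positive errno value if it fails, or -1 if the setup itself failed.
 */
int try_unspec_reconnect(void)
{
	struct sockaddr_in addr;
	struct sockaddr unspec;
	socklen_t len = sizeof(addr);
	int lsock, csock, ret = -1;

	lsock = socket(AF_INET, SOCK_STREAM, 0);
	if (lsock < 0)
		return -1;
	csock = socket(AF_INET, SOCK_STREAM, 0);
	if (csock < 0) {
		close(lsock);
		return -1;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	/* Set up the listener and read back the port it was given. */
	if (bind(lsock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lsock, 1) < 0 ||
	    getsockname(lsock, (struct sockaddr *)&addr, &len) < 0)
		goto out;

	/* The handshake completes via the listen backlog; no accept() needed. */
	if (connect(csock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	/* The call under test: connect the established socket to AF_UNSPEC. */
	memset(&unspec, 0, sizeof(unspec));
	unspec.sa_family = AF_UNSPEC;
	if (connect(csock, &unspec, sizeof(unspec)) == 0)
		ret = 0;
	else
		ret = errno;
out:
	close(csock);
	close(lsock);
	return ret;
}
```

Calling try_unspec_reconnect() from a small main() and printing
strerror() of a positive return value shows whether the kernel accepted
the AF_UNSPEC connect or rejected it (EINVAL would match what I saw).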

> > If there really is no alternative to freeing the socket, then the only
> > real fix I can see is to set PF_MEMALLOC when we go to create it and
> > then reset it afterward. That's a pretty ugly fix though...
> 
> Agreed...
> 

That looks basically like what Mel is doing to work around the problem,
though he only does it for xprt's that are tagged as being swapped
over. We could just make that unconditional, but the "congested" flag
scheme sounds better.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>