Hi Trond, On Feb 27 07:37 PM, Trond Myklebust wrote: > NACK. If you need a memory barrier, put it in the UDP case only (and > justify why). The TCP case sets and clears the above in the > connect/disconnect softirqs and so does not require them. > > I certainly see no reason whatsoever for adding memory barriers to > xprt_connected() or xprt_clear_connected(). > > > static inline int xprt_test_and_set_connected(struct rpc_xprt *xprt) > > Index: linux-2.6.27.15-2/net/sunrpc/xprtsock.c > > =================================================================== > > --- linux-2.6.27.15-2.orig/net/sunrpc/xprtsock.c > > +++ linux-2.6.27.15-2/net/sunrpc/xprtsock.c > > @@ -1512,14 +1512,13 @@ static void xs_udp_finish_connecting(str > > sk->sk_no_check = UDP_CSUM_NORCV; > > sk->sk_allocation = GFP_ATOMIC; > > > > - xprt_set_connected(xprt); > > - > > /* Reset to new socket */ > > transport->sock = sock; > > transport->inet = sk; > > > > xs_set_memalloc(xprt); > > > > + xprt_set_connected(xprt); > > write_unlock_bh(&sk->sk_callback_lock); > > } > > xs_udp_do_set_buffer_size(xprt); > > Please move that call to xprt_set_connected() outside the if statement. > We want to set the connected flag unconditionally here. FYI, I've been digging a bit. This commit adds the clearing of the bits: commit b6ddf64ffe9d59577a9176856bb6fe69a539f573 Author: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Thu Apr 17 18:52:19 2008 -0400 SUNRPC: Fix up xprt_write_space() <snip> @@ -588,19 +603,20 @@ static int xs_udp_send_request(struct rpc_task *task) } switch (status) { + case -EAGAIN: + xs_nospace(task); + break; case -ENETUNREACH: case -EPIPE: case -ECONNREFUSED: /* When the server has died, an ICMP port unreachable message * prompts ECONNREFUSED. */ - break; - case -EAGAIN: - xs_nospace(task); + clear_bit(SOCK_ASYNC_NOSPACE, &transport->sock->flags); break; default: + clear_bit(SOCK_ASYNC_NOSPACE, &transport->sock->flags); dprintk("RPC: sendmsg returned unrecognized error %d\n", -status); - break; } return status; This went in to 2.6.26 which was the first time we saw the bug (2.6.26.3) on kerneloops. I don't know if *this* is a bad commit or some other locking changed in 2.6.26 which tickles the bug. Hope it helps. Have a good weekend all! =a= -- =================== Aaron Straus aaron@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html