Hi Neil,

Was your patch 3ffbc1d6558 (net/sunrpc/xprt_sock: fix regression in
connection error reporting) an attempt to fix this problem, or was it
for something else? I think the latter, but would like to verify. I was
wondering whether anything was done upstream that fixed this issue.

On Thu, Apr 7, 2016 at 8:47 PM, NeilBrown <nfbrown@xxxxxxxxxx> wrote:
> On Thu, Apr 07 2016, Richard Laager wrote:
>
>>
>> In a separate failover event, I tested accessing NFS over TCP. I do
>> *not* get "Received RST segment.". So I conclude that
>> tcp_validate_incoming() is not being called.
>
> Thanks for all the details. The ssh experiment quite convincingly shows
> that the network infrastructure is working correctly.
>
> The NFS experiment is strange - the RST doesn't even seem to be
> arriving. Yet the tcpdump shows that it did.
>
>>
>> Any thoughts on what that means or where to go from here?
>
> Working back from tcp_validate_incoming, it is called from two places.
> One is tcp_rcv_state_process() which handles connections which are not
> currently established, so it should be irrelevant.
> The other is tcp_rcv_established().
> As the RST flag is set the fast-path branch will not be taken (as
> ->pred_flags cannot possibly contain RST) so it should reach the
> slow_path: label. The only things that can stop the code reaching
> tcp_validate_incoming() are the "len" being less than 20 (which it isn't)
> or the tcp checksum being wrong.
> The tcpdump showed the checksum as '0', but that could be due to tcp
> checksum offload.
>
> You could add some printks in there (after slow_path:) to report when
> tcp_checksum_complete_user() fails, particularly for th->rst packets.
>
> Or you could try turning off tcp checksum offloading with
>    ethtool --offload rx off DEVICENAME
> (I think).
>
> It might help to see a tcpdump trace of the case where the "ssh"
> connection was broken successfully for comparison with the case where
> the nfs connection wasn't broken. Or it might not.
>
> NeilBrown
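
For anyone following the reasoning in the quoted message, the order of checks Neil describes for an established connection can be sketched roughly as below. This is only a simplified user-space model, not kernel code: `struct segment`, `enum rx_outcome`, `TCP_MIN_HEADER_LEN` and `classify_segment()` are hypothetical names made up for illustration, and the real tcp_rcv_established() does a good deal more than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one incoming TCP segment. */
struct segment {
    size_t len;                   /* length counted from the TCP header */
    bool rst;                     /* RST flag set in the TCP header */
    bool flags_match_prediction;  /* header matches the predicted flags */
    bool checksum_ok;             /* software checksum verification passed */
};

/* Hypothetical outcomes, one per branch discussed in the thread. */
enum rx_outcome {
    RX_FAST_PATH,          /* header prediction hit, no full validation */
    RX_DROPPED_SHORT,      /* shorter than a minimal TCP header */
    RX_DROPPED_BAD_CSUM,   /* checksum failed, discarded before validation */
    RX_VALIDATE_INCOMING   /* would reach tcp_validate_incoming() */
};

#define TCP_MIN_HEADER_LEN 20  /* the "len less than 20" check */

static enum rx_outcome classify_segment(const struct segment *seg)
{
    /* The predicted flags never include RST, so an RST segment
     * cannot take the fast path. */
    if (seg->flags_match_prediction && !seg->rst)
        return RX_FAST_PATH;

    /* Slow path: only two checks sit in front of the validation step. */
    if (seg->len < TCP_MIN_HEADER_LEN)
        return RX_DROPPED_SHORT;
    if (!seg->checksum_ok)
        return RX_DROPPED_BAD_CSUM;

    return RX_VALIDATE_INCOMING;
}
```

In this model, an RST segment of normal length only fails to reach the validation step when the checksum check rejects it, which is why the checksum is the thing the quoted message suggests instrumenting.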