Re: PROBLEM: NFS Client Ignores TCP Resets

On Thu, Apr 07 2016, Richard Laager wrote:

>
> In a separate failover event, I tested accessing NFS over TCP. I do
> *not* get "Received RST segment.". So I conclude that
> tcp_validate_incoming() is not being called.

Thanks for all the details.  The ssh experiment quite convincingly shows
that the network infrastructure is working correctly.

The NFS experiment is strange - the RST doesn't even seem to be
arriving.  Yet the tcpdump shows that it did.

>
> Any thoughts on what that means or where to go from here?

Working back from tcp_validate_incoming, it is called from two places.
One is tcp_rcv_state_process() which handles connections which are not
currently established, so it should be irrelevant.
The other is tcp_rcv_established().
As the RST flag is set the fast-path branch will not be taken (as
->pred_flags cannot possibly contain RST) so it should reach the
slow_path: label.  The only things that can stop the code reaching
tcp_validate_incoming() are the "len" being less than 20 (which it isn't)
or the tcp checksum being wrong.
The tcpdump showed the checksum as '0', but that could be due to tcp
checksum offload.
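
To make that reasoning concrete, here is a rough user-space model of the
decision as I described it above.  It is only a sketch, not the kernel
source: the struct, the enum and classify_segment() are names invented
for illustration, and the real tcp_rcv_established() checks more than
this (sequence numbers, the actual header length, and so on).

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimum TCP header length in bytes (no options). */
#define TCP_MIN_HDR_LEN 20

/* Hypothetical outcomes for this model; not kernel identifiers. */
enum rx_outcome {
    RX_FAST_PATH,          /* header prediction matched */
    RX_DROP_SHORT,         /* segment shorter than a TCP header */
    RX_DROP_CSUM,          /* checksum verification failed */
    RX_VALIDATE_INCOMING   /* would reach tcp_validate_incoming() */
};

/* Hypothetical, heavily reduced view of a received segment. */
struct seg_model {
    bool rst;                 /* RST flag set in the TCP header */
    bool matches_prediction;  /* header bits equal the predicted flags */
    size_t len;               /* segment length in bytes */
    bool csum_ok;             /* result of checksum verification */
};

/*
 * Model of the path described in the text: a segment with RST set can
 * never match the predicted flags, so it always falls to the slow path,
 * where only a short length or a bad checksum stops it before
 * validation.
 */
enum rx_outcome classify_segment(const struct seg_model *s)
{
    if (!s->rst && s->matches_prediction)
        return RX_FAST_PATH;

    /* slow path */
    if (s->len < TCP_MIN_HDR_LEN)
        return RX_DROP_SHORT;
    if (!s->csum_ok)
        return RX_DROP_CSUM;

    return RX_VALIDATE_INCOMING;
}
```

In this model an RST segment with a good checksum always ends up at
RX_VALIDATE_INCOMING, so if "Received RST segment." never appears, a
checksum failure (RX_DROP_CSUM here) is the remaining suspect.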

You could add some printks in there (after the slow_path: label) to report
when tcp_checksum_complete_user() fails, particularly for th->rst packets.

Or you could try turning off tcp receive checksum offloading with
  ethtool --offload DEVICENAME rx off
(I think).

It might help to see a tcpdump trace of the case where the "ssh"
connection was broken successfully for comparison with the case where
the nfs connection wasn't broken.  Or it might not.

NeilBrown
