Hello all
I am facing an issue with the Raspberry Pi 3B+ and its onboard Ethernet
adapter. When doing a large sustained transfer (more than 1 GB), the
transfer hangs and then fails after a few minutes.
I have two ways to reproduce this issue:

Using NFS (v3 or v4):
  dd if=/dev/zero of=/NFSPATH/file bs=4M count=1000 status=progress
At some point dd hangs and becomes uninterruptible (there is no way to
Ctrl-C it or kill it). After a few minutes, dd dies and a bunch of
"NFS server not responding" / "NFS server is OK" messages are seen in
the journal.

Using SCP:
  dd if=/dev/zero of=/tmp/file bs=4M count=1000
  scp /tmp/file user@server:/directory
scp hangs after about 1 GB, and after a few minutes it fails with the
message "client_loop: send disconnect: Broken pipe" followed by
"lost connection".
It appears this is a known bug related to TCP Segmentation Offload (TSO)
and Selective Acknowledgment (SACK).
Disabling TSO and GSO (ethtool -K eth0 tso off, then
ethtool -K eth0 gso off) solves the issue.
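For anyone who wants to confirm the offload state before and after the
workaround, here is a minimal sketch. It assumes the interface is eth0
and that "ethtool -k" prints the usual "tcp-segmentation-offload" and
"generic-segmentation-offload" lines; offload_state is only a
hypothetical helper name, not part of ethtool.

```shell
#!/bin/sh
# Sketch: summarize TSO/GSO state from `ethtool -k` output read on stdin.
# offload_state is a hypothetical helper, not an ethtool feature.
offload_state() {
    awk -F': *' '
        $1 == "tcp-segmentation-offload"     { print "tso=" $2 }
        $1 == "generic-segmentation-offload" { print "gso=" $2 }
    '
}

# Intended use on the affected Pi (changing settings needs root):
#   ethtool -k eth0 | offload_state      # state before the workaround
#   ethtool -K eth0 tso off gso off      # disable both offloads
#   ethtool -k eth0 | offload_state      # state after the workaround
```

Note that a setting changed with ethtool -K does not survive a reboot
or a driver reload, so it has to be reapplied after each one.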
The Raspberry Pi team has created a patch that disables the feature by
default, and that patch is applied by default in Raspbian.
Comment from the patch:
/* TSO seems to be having some issue with Selective Acknowledge (SACK) that
* results in lost data never being retransmitted.
* Disable it by default now, but adds a module parameter to enable it for
* debug purposes (the full cause is not currently understood).
*/
For reference, here is
the issue I created yesterday:
https://github.com/raspberrypi/linux/issues/3395
and the related issues from the Raspberry Pi dev team:
https://github.com/raspberrypi/linux/issues/2482 &
https://github.com/raspberrypi/linux/issues/2449
If you need me to test anything or provide more information, I'll be
pleased to help.
Fox
PS: this is a resend in plain text, because vger rejected the first
one for its HTML formatting... :)