On Sep 19, 2008, at 5:26 AM, Maciej W. Rozycki wrote:
On Fri, 19 Sep 2008, Ralf Baechle wrote:
Seriously though, I smell a caller somewhere fails to call
csum_fold() on
the result obtained from csum_partial() where it should, so it
would be
good to fix the bug rather than trying to cover it. Bryan, would
you be
able to track down the caller?
Not quite. Internally the IP stack maintains the checksum as a 32-
bit
value for performance sake. It only folds it to 16-bit when it has
to.
That's been my understanding from my little investigation yesterday
evening, but Bryan's problem has come from somewhere after all and
Atsushi-san's 32-bit addition fix didn't reportedly work while full
folding did, so I have assumed there must be some dependency somewhere
where the final folding does not happen. I have referred to the
original
report concerning SPARC64 now and it seems to narrow the problem
down to
the 32 MSBs only, so I would prefer to have any confusion cleared.
Bryan, can you please verify whether Ralf's fix works for you or not?
Ralph is correct on the usage of the checksum by the IP stack, and the
original problem report on the Sparc64 list is pretty thorough about
describing how the bug affects the stack.
original report: http://www.spinics.net/lists/sparclinux/msg00173.html
resolution: http://www.spinics.net/lists/sparclinux/msg00179.html
A synopsis is that when TCP resends a portion of a segment whose
checksum was already previously calculated (correctly), it splits the
segment into a new and shortened old segment [see net/ipv4/
tcp_output.c:tcp_fragment "Copy and checksum data tail into the new
buffer" ~ line 737ish]. The new segment has its checksum calculated
via csum_partial_copy_nocheck(), and the shortened old segment has its
checksum calculated via csum_block_sub() (for fast checksum delta).
It is the path through csum_block_sub() where the bug occurs, in which
the output from csum_block_sub() has the upper bits of the csum set
for the carry, and later when tcp_v4_send_check() is called with the
skb->csum, the csum is off-by-one. Obviously all of these need to be
in the non-hardware checksum case.
I can test a patch later today on the Cavium CN3010 64bit; I have been
running with the suboptimal but apparently effective fold-relocation
patch and the problem is completely gone, fwiw.
Also... I was thinking it would be trivial to extract the relevant
bits of the trigger from this repro and just write a small native app
to test the csum generation-and-return; I though about doing it myself
earlier, because I have a tcpdump capture of the segment that was
triggering the bug.
--
-bp