FWIW... your patch (below) seems to actually fix the checksum problem
in my testing. What was your concern about it?
--
-bp
On Sep 17, 2008, at 3:40 AM, Maciej W. Rozycki wrote:
On Tue, 16 Sep 2008, Bryan Phillippe wrote:
I've experimented with the following change:
--- /home/bp/tmp/csum_partial.S.orig 2008-09-16 12:01:00.000000000 -0700
+++ arch/mips/lib/csum_partial.S 2008-09-16 11:51:44.000000000 -0700
@@ -281,6 +281,23 @@
.set reorder
/* Add the passed partial csum. */
ADDC(sum, a2)
+
+ /* fold checksum again to clear the high bits before returning */
+ .set push
+ .set noat
+#ifdef USE_DOUBLE
+ dsll32 v1, sum, 0
+ daddu sum, v1
+ sltu v1, sum, v1
+ dsra32 sum, sum, 0
+ addu sum, v1
+#endif
+ sll v1, sum, 16
+ addu sum, v1
+ sltu v1, sum, v1
+ srl sum, sum, 16
+ addu sum, v1
+
jr ra
.set noreorder
END(csum_partial)
and it seems to fix the problem for me. Can you comment?
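For anyone following the 32-bit tail of the patch above, the five-instruction
sequence (sll, addu, sltu, srl, addu) can be modelled in C. This is only a
sketch: the name fold32_to_16 is hypothetical and not a kernel symbol, and it
assumes `sum` already fits in 32 bits (that is, the USE_DOUBLE step has run
or is not needed).

```c
#include <stdint.h>

/* Hypothetical C model of the 32-bit fold in the patch above.
 * Each statement mirrors one instruction; v1 plays the scratch register. */
static uint32_t fold32_to_16(uint32_t sum)
{
    uint32_t v1;

    v1 = sum << 16;      /* sll  v1, sum, 16 : low half moved to the top    */
    sum += v1;           /* addu sum, v1     : top half now holds hi + lo   */
    v1 = (sum < v1);     /* sltu v1, sum, v1 : 1 if that addition wrapped   */
    sum >>= 16;          /* srl  sum, sum, 16: keep only the folded half    */
    sum += v1;           /* addu sum, v1     : end-around carry             */

    return sum;          /* always in the range 0..0xffff                   */
}
```

For example, fold32_to_16(0x12345678) adds 0x1234 and 0x5678 and yields
0x68ac, while fold32_to_16(0xffff0001) wraps once and yields 0x0001.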
It seems obvious that a carry out of bit #15 in the final addition of
the passed checksum -- ADDC(sum, a2) -- will negate the effect of the
folding. However, a simpler fix should do as well. Please check whether
the following patch works for you. Note that it is completely untested
and that further optimisation is possible, but I've skipped it in this
version for clarity.
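The ordering argument can be illustrated with a small C model. This is a
sketch under assumptions, not the kernel code: addc32 approximates the
ADDC macro as a 32-bit add with end-around carry, fold16 is a generic
ones'-complement fold to 16 bits, and all function names here are
hypothetical.

```c
#include <stdint.h>

/* Assumed model of ADDC: 32-bit add with end-around carry. */
static uint32_t addc32(uint32_t sum, uint32_t addend)
{
    sum += addend;
    sum += (sum < addend);   /* add back the carry out of bit 31 */
    return sum;
}

/* Generic fold of a 32-bit ones'-complement sum down to 16 bits. */
static uint32_t fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);   /* absorb a possible carry */
    return sum;
}

/* Old order: fold first, then add the passed partial checksum. */
static uint32_t fold_then_add(uint32_t sum, uint32_t partial)
{
    return addc32(fold16(sum), partial);
}

/* New order: add the passed partial checksum first, then fold. */
static uint32_t add_then_fold(uint32_t sum, uint32_t partial)
{
    return fold16(addc32(sum, partial));
}
```

With sum = 0xffff and partial = 1, fold_then_add returns 0x10000 (the
carry out of bit 15 leaves a bit set above the low 16), whereas
add_then_fold returns 0x0001. Both values are the same ones'-complement
sum, but only the second is already folded.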
Thanks for raising the issue.
Maciej
Signed-off-by: Maciej W. Rozycki <macro@xxxxxxxxxxxxxx>
--- a/arch/mips/lib/csum_partial.S 2008-05-05 02:55:23.000000000 +0000
+++ b/arch/mips/lib/csum_partial.S 2008-09-17 10:32:37.000000000 +0000
@@ -253,6 +253,9 @@ LEAF(csum_partial)
1: ADDC(sum, t1)
+ /* Add the passed partial csum. */
+ ADDC(sum, a2)
+
/* fold checksum */
.set push
.set noat
@@ -278,11 +281,8 @@ LEAF(csum_partial)
andi sum, 0xffff
.set pop
1:
- .set reorder
- /* Add the passed partial csum. */
- ADDC(sum, a2)
jr ra
- .set noreorder
+ nop
END(csum_partial)