On Wed, 17 Sep 2008 11:40:01 +0100 (BST), "Maciej W. Rozycki" <macro@xxxxxxxxxxxxxx> wrote:
> > and it seems to fix the problem for me. Can you comment?
>
>  It seems obvious that a carry from the bit #15 in the last addition of
> the passed checksum -- ADDC(sum, a2) -- will negate the effect of the
> folding. However a simpler fix should do as well. Try if the following
> patch works for you. Please note this is completely untested and further
> optimisation is possible, but I've skipped it in this version for clarity.

Well, csum_partial()'s return value is __wsum (32-bit), so strictly
there is no need for folding into 16 bits.

I think this bug was introduced by my commit
ed99e2bc1dc5dc54eb5a019f4975562dbef20103 ("[MIPS] Optimize csum_partial
for 64bit kernel"). That commit changed ADDC to use daddu on 64-bit
kernels, which is wrong for the last addition of the passed partial
csum: that one must be a 32-bit addition.

Here is my proposed fix. Bryan, could you try it too?

diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 8d77841..8b15766 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -60,6 +60,14 @@
 	ADD	sum, v1;					\
 	.set	pop

+#define ADDC32(sum,reg)						\
+	.set	push;						\
+	.set	noat;						\
+	addu	sum, reg;					\
+	sltu	v1, sum, reg;					\
+	addu	sum, v1;					\
+	.set	pop
+
 #define CSUM_BIGCHUNK1(src, offset, sum, _t0, _t1, _t2, _t3)	\
 	LOAD	_t0, (offset + UNIT(0))(src);			\
 	LOAD	_t1, (offset + UNIT(1))(src);			\
@@ -280,7 +288,7 @@ LEAF(csum_partial)
 1:
 	.set	reorder
 	/* Add the passed partial csum. */
-	ADDC(sum, a2)
+	ADDC32(sum, a2)
 	jr	ra
 	.set	noreorder
 	END(csum_partial)
@@ -681,7 +689,7 @@ EXC(	sb	t0, NBYTES-2(dst), .Ls_exc)
 	.set	pop
 1:
 	.set	reorder
-	ADDC(sum, psum)
+	ADDC32(sum, psum)
 	jr	ra
 	.set	noreorder