On Thu, Jan 26, 2017 at 07:57:49AM -0800, Alexander Duyck wrote: > Date: Thu, 26 Jan 2017 07:57:49 -0800 > From: Alexander Duyck <alexander.duyck@xxxxxxxxx> > To: Mark Zhang <bomb.zhang@xxxxxxxxx> > Cc: Ralf Baechle <ralf@xxxxxxxxxxxxxx>, David Miller <davem@xxxxxxxxxxxxx>, > Alexander Duyck <aduyck@xxxxxxxxxxxx>, linux-mips@xxxxxxxxxxxxxx, > "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx> > Subject: Re: [Bug fix]mips 64bits checksum error -- csum_tcpudp_nofold > Content-Type: text/plain; charset=UTF-8 > > On Wed, Jan 25, 2017 at 6:33 PM, Mark Zhang <bomb.zhang@xxxxxxxxx> wrote: > > Hi Alex, > > > > Thanks for your reply. > > I tested your correction. The result is correct. > > The C language will cause this function(csum_tcpudp_nofold) become > > 176 MIPS instructions. The assemble code is 150 MIPS instruction. > > If the MIPS CPU is running as 1GHz, C language cause more 60 nano > > seconds during send/receive each tcp/udp packet. I'm not sure whether > > it will cause any negative result if the frequency of CPU was lower. > > MIPS arch is usually used in networking equipments. > > I think Ralf's correction is better. > > > > - Mark > > > > Signed-off-by: Ralf Baechle <ralf@xxxxxxxxxxxxxx> > > > > arch/mips/include/asm/checksum.h | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h > > index 5c585c5..08b6ba3 100644 > > --- a/arch/mips/include/asm/checksum.h > > +++ b/arch/mips/include/asm/checksum.h > > @@ -186,7 +186,9 @@ static inline __wsum csum_tcpudp_nofold(__be32 saddr, > > " daddu %0, %4 \n" > > " dsll32 $1, %0, 0 \n" > > " daddu %0, $1 \n" > > + " sltu $1, %0, $1 \n" > > " dsra32 %0, %0, 0 \n" > > + " daddu %0, $1 \n" > > #endif > > " .set pop" > > : "=r" (sum) > > > > This looks good to me. > > Acked-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxx> I've actually checked in a slightly different version that this which uses an ADDU rather than a DADDU for the second instruction added. This is because the DSRA32 ensures the 32 bit result in %0 is properly signed extended to 64 bit as required by the MIPS architecture and the ADDU then simply operates on that 32 bit %0. Ralf