On Wed, Jan 25, 2017 at 6:33 PM, Mark Zhang <bomb.zhang@xxxxxxxxx> wrote: > Hi Alex, > > Thanks for your reply. > I tested your correction. The result is correct. > The C language will cause this function(csum_tcpudp_nofold) become > 176 MIPS instructions. The assemble code is 150 MIPS instruction. > If the MIPS CPU is running as 1GHz, C language cause more 60 nano > seconds during send/receive each tcp/udp packet. I'm not sure whether > it will cause any negative result if the frequency of CPU was lower. > MIPS arch is usually used in networking equipments. > I think Ralf's correction is better. > > - Mark > > Signed-off-by: Ralf Baechle <ralf@xxxxxxxxxxxxxx> > > arch/mips/include/asm/checksum.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h > index 5c585c5..08b6ba3 100644 > --- a/arch/mips/include/asm/checksum.h > +++ b/arch/mips/include/asm/checksum.h > @@ -186,7 +186,9 @@ static inline __wsum csum_tcpudp_nofold(__be32 saddr, > " daddu %0, %4 \n" > " dsll32 $1, %0, 0 \n" > " daddu %0, $1 \n" > + " sltu $1, %0, $1 \n" > " dsra32 %0, %0, 0 \n" > + " daddu %0, $1 \n" > #endif > " .set pop" > : "=r" (sum) > This looks good to me. Acked-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxx>