Re: [PATCH v2] MIPS: Check __clang__ to avoid performance influence with GCC in csum_tcpudp_nofold()

On 03/15/2021 08:42 PM, David Laight wrote:
From: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
Sent: 15 March 2021 12:26
On 03/15/2021 04:49 AM, Maciej W. Rozycki wrote:
On Tue, 9 Mar 2021, Tiezhu Yang wrote:

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 1e6c135..80eddd4 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -128,9 +128,13 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)

   static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
   					__u32 len, __u8 proto,
-					__wsum sum)
+					__wsum sum_in)
   {
-	unsigned long tmp = (__force unsigned long)sum;
+#ifdef __clang__
+	unsigned long sum = (__force unsigned long)sum_in;
+#else
+	__wsum sum = sum_in;
+#endif
   This looks much better to me, but I'd keep the variable names unchanged
as `sum_in' isn't used beyond the initial assignment anyway (you'll have
to update the references with asm operands accordingly of course).

   Have you verified that code produced by GCC remains the same with your
change in place as it used to be up to commit 198688edbf77?  I can see no
such information in the commit description whether here or in the said
commit.

    Maciej
Hi Maciej,

Thanks for your reply.

gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110

net/ipv4/tcp_ipv4.c
tcp_v4_send_reset()
    csum_tcpudp_nofold()

objdump -d vmlinux > vmlinux.dump

(1) Before commit 198688edbf77
("MIPS: Fix inline asm input/output type mismatch in checksum.h used
with Clang"):

ffffffff80aa835c:       00004025        move    a4,zero
ffffffff80aa8360:       92020012        lbu     v0,18(s0)
ffffffff80aa8364:       de140030        ld      s4,48(s0)
ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
ffffffff80aa8378:       afa70038        sw      a3,56(sp)
ffffffff80aa837c:       24071a00        li      a3,6656
ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
ffffffff80aa8390:       0081202d        daddu   a0,a0,at
ffffffff80aa8394:       0081082b        sltu    at,a0,at
ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
ffffffff80aa839c:       00812021        addu    a0,a0,at

(2) After commit 198688edbf77
("MIPS: Fix inline asm input/output type mismatch in checksum.h used
with Clang"):

ffffffff80aa836c:       00004025        move    a4,zero
ffffffff80aa8370:       92040012        lbu     a0,18(s0)
ffffffff80aa8374:       de140030        ld      s4,48(s0)
ffffffff80aa8378:       0062182d        daddu   v1,v1,v0
ffffffff80aa837c:       308400ff        andi    a0,a0,0xff
ffffffff80aa8380:       9c62000c        lwu     v0,12(v1)
ffffffff80aa8384:       9c660010        lwu     a2,16(v1)
ffffffff80aa8388:       afa70038        sw      a3,56(sp)
ffffffff80aa838c:       24071a00        li      a3,6656
ffffffff80aa8390:       0046102d        daddu   v0,v0,a2
ffffffff80aa8394:       0047102d        daddu   v0,v0,a3
ffffffff80aa8398:       0048102d        daddu   v0,v0,a4
ffffffff80aa839c:       0002083c        dsll32  at,v0,0x0
ffffffff80aa83a0:       0041102d        daddu   v0,v0,at
ffffffff80aa83a4:       0041082b        sltu    at,v0,at
ffffffff80aa83a8:       0002103f        dsra32  v0,v0,0x0
ffffffff80aa83ac:       00411021        addu    v0,v0,at

(3) With this patch:

ffffffff80aa835c:       00004025        move    a4,zero
ffffffff80aa8360:       92020012        lbu     v0,18(s0)
ffffffff80aa8364:       de140030        ld      s4,48(s0)
ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
ffffffff80aa8378:       afa70038        sw      a3,56(sp)
ffffffff80aa837c:       24071a00        li      a3,6656
ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
ffffffff80aa8390:       0081202d        daddu   a0,a0,at
ffffffff80aa8394:       0081082b        sltu    at,a0,at
ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
ffffffff80aa839c:       00812021        addu    a0,a0,at

(4) With the following changes based on commit 198688edbf77
("MIPS: Fix inline asm input/output type mismatch in checksum.h used
with Clang"):

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 1e6c135..e1f80407 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -130,7 +130,11 @@ static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
                       __u32 len, __u8 proto,
                       __wsum sum)
   {
+#ifdef __clang__
       unsigned long tmp = (__force unsigned long)sum;
+#else
+    __wsum tmp = sum;
+#endif

       __asm__(
       "    .set    push        # csum_tcpudp_nofold\n"

ffffffff80aa835c:       00004025        move    a4,zero
ffffffff80aa8360:       92020012        lbu     v0,18(s0)
ffffffff80aa8364:       de140030        ld      s4,48(s0)
ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
ffffffff80aa8378:       afa70038        sw      a3,56(sp)
ffffffff80aa837c:       24071a00        li      a3,6656
ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
ffffffff80aa8390:       0081202d        daddu   a0,a0,at
ffffffff80aa8394:       0081082b        sltu    at,a0,at
ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
ffffffff80aa839c:       00812021        addu    a0,a0,at

The code produced by GCC is the same in (1), (3) and (4).
The last change looks better (it needs fewer changes on top of commit
198688edbf77), so I will send v3 later.
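
For anyone not used to MIPS opcodes, the last five instructions of each
listing (dsll32, daddu, sltu, dsra32, addu) fold the 64-bit sum into 32 bits
with an end-around carry. Below is a rough user-space C sketch of that step,
not the kernel source. The names fold64_sketch(), csum_tcpudp_nofold_sketch()
and sketch_selfcheck() are made up for illustration, a little-endian layout is
assumed, and reading "li a3,6656" as (6 + 20) << 8 (proto 6, length 20) is an
inference from the listing only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a 64-bit accumulator into 32 bits with end-around carry.
 * Comments name the instruction from listing (1) that each line mirrors.
 */
static uint32_t fold64_sketch(uint64_t acc)
{
	uint64_t hi = acc << 32;		/* dsll32 at,a0,0x0 */

	acc += hi;				/* daddu  a0,a0,at  */
	uint32_t carry = acc < hi;		/* sltu   at,a0,at  */

	/* dsra32 + addu: only the low 32 bits of the result are kept here */
	return (uint32_t)(acc >> 32) + carry;
}

/* Hypothetical stand-in for csum_tcpudp_nofold(); little-endian assumed. */
static uint32_t csum_tcpudp_nofold_sketch(uint32_t saddr, uint32_t daddr,
					  uint32_t len, uint8_t proto,
					  uint32_t sum)
{
	uint64_t acc = sum;			/* move a4,zero when sum == 0 */

	acc += saddr;
	acc += daddr;
	acc += (uint64_t)(proto + len) << 8;	/* 6656 for proto 6, len 20 */
	return fold64_sketch(acc);
}

/* A few sanity checks on the sketch above. */
static void sketch_selfcheck(void)
{
	assert(fold64_sketch(0x0000000200000003ULL) == 5);
	assert(fold64_sketch(0x00000001ffffffffULL) == 1);
	assert(csum_tcpudp_nofold_sketch(0, 0, 20, 6, 0) == 6656);
}
```

The sketch uses 64-bit arithmetic throughout, which is what the daddu
sequence in the listings does. The inline asm operand types differ between
the variants compared above, and the sketch does not model that.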
Aren't those all the same, apart from register selection?
Not that I grok the mips opcodes.
But that code has horridness on its side.

The only obvious difference is that something else changes the
code offset from xxxx5c to xxxx6c.

	David


Hi David,

Yes, there seem to be no obvious differences.
Let me wait for other feedback.

Hi Thomas and Maciej,

Is this patch necessary? If not, we can ignore it.
If so, I will send v3 with the changes in (4) above.

Thanks,
Tiezhu



