RE: [PATCH v2] MIPS: Check __clang__ to avoid performance influence with GCC in csum_tcpudp_nofold()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
> Sent: 15 March 2021 12:26
> On 03/15/2021 04:49 AM, Maciej W. Rozycki wrote:
> > On Tue, 9 Mar 2021, Tiezhu Yang wrote:
> >
> >> diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
> >> index 1e6c135..80eddd4 100644
> >> --- a/arch/mips/include/asm/checksum.h
> >> +++ b/arch/mips/include/asm/checksum.h
> >> @@ -128,9 +128,13 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
> >>
> >>   static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
> >>   					__u32 len, __u8 proto,
> >> -					__wsum sum)
> >> +					__wsum sum_in)
> >>   {
> >> -	unsigned long tmp = (__force unsigned long)sum;
> >> +#ifdef __clang__
> >> +	unsigned long sum = (__force unsigned long)sum_in;
> >> +#else
> >> +	__wsum sum = sum_in;
> >> +#endif
> >   This looks much better to me, but I'd keep the variable names unchanged
> > as `sum_in' isn't used beyond the initial assignment anyway (you'll have
> > to update the references with asm operands accordingly of course).
> >
> >   Have you verified that code produced by GCC remains the same with your
> > change in place as it used to be up to commit 198688edbf77?  I can see no
> > such information in the commit description whether here or in the said
> > commit.
> >
> >    Maciej
> 
> Hi Maciej,
> 
> Thanks for your reply.
> 
> gcc --version
> gcc (Debian 10.2.1-6) 10.2.1 20210110
> 
> net/ipv4/tcp_ipv4.c
> tcp_v4_send_reset()
>    csum_tcpudp_nofold()
> 
> objdump -d vmlinux > vmlinux.dump
> 
> (1) Before commit 198688edbf77
> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
> with Clang"):
> 
> ffffffff80aa835c:       00004025        move    a4,zero
> ffffffff80aa8360:       92020012        lbu     v0,18(s0)
> ffffffff80aa8364:       de140030        ld      s4,48(s0)
> ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
> ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
> ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
> ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
> ffffffff80aa8378:       afa70038        sw      a3,56(sp)
> ffffffff80aa837c:       24071a00        li      a3,6656
> ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
> ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
> ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
> ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
> ffffffff80aa8390:       0081202d        daddu   a0,a0,at
> ffffffff80aa8394:       0081082b        sltu    at,a0,at
> ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
> ffffffff80aa839c:       00812021        addu    a0,a0,at
> 
> (2) After commit 198688edbf77
> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
> with Clang"):
> 
> ffffffff80aa836c:       00004025        move    a4,zero
> ffffffff80aa8370:       92040012        lbu     a0,18(s0)
> ffffffff80aa8374:       de140030        ld      s4,48(s0)
> ffffffff80aa8378:       0062182d        daddu   v1,v1,v0
> ffffffff80aa837c:       308400ff        andi    a0,a0,0xff
> ffffffff80aa8380:       9c62000c        lwu     v0,12(v1)
> ffffffff80aa8384:       9c660010        lwu     a2,16(v1)
> ffffffff80aa8388:       afa70038        sw      a3,56(sp)
> ffffffff80aa838c:       24071a00        li      a3,6656
> ffffffff80aa8390:       0046102d        daddu   v0,v0,a2
> ffffffff80aa8394:       0047102d        daddu   v0,v0,a3
> ffffffff80aa8398:       0048102d        daddu   v0,v0,a4
> ffffffff80aa839c:       0002083c        dsll32  at,v0,0x0
> ffffffff80aa83a0:       0041102d        daddu   v0,v0,at
> ffffffff80aa83a4:       0041082b        sltu    at,v0,at
> ffffffff80aa83a8:       0002103f        dsra32  v0,v0,0x0
> ffffffff80aa83ac:       00411021        addu    v0,v0,at
> 
> (3) With this patch:
> 
> ffffffff80aa835c:       00004025        move    a4,zero
> ffffffff80aa8360:       92020012        lbu     v0,18(s0)
> ffffffff80aa8364:       de140030        ld      s4,48(s0)
> ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
> ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
> ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
> ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
> ffffffff80aa8378:       afa70038        sw      a3,56(sp)
> ffffffff80aa837c:       24071a00        li      a3,6656
> ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
> ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
> ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
> ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
> ffffffff80aa8390:       0081202d        daddu   a0,a0,at
> ffffffff80aa8394:       0081082b        sltu    at,a0,at
> ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
> ffffffff80aa839c:       00812021        addu    a0,a0,at
> 
> (4) With the following changes based on commit 198688edbf77
> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
> with Clang"):
> 
> diff --git a/arch/mips/include/asm/checksum.h
> b/arch/mips/include/asm/checksum.h
> index 1e6c135..e1f80407 100644
> --- a/arch/mips/include/asm/checksum.h
> +++ b/arch/mips/include/asm/checksum.h
> @@ -130,7 +130,11 @@ static inline __wsum csum_tcpudp_nofold(__be32
> saddr, __be32 daddr,
>                       __u32 len, __u8 proto,
>                       __wsum sum)
>   {
> +#ifdef __clang__
>       unsigned long tmp = (__force unsigned long)sum;
> +#else
> +    __wsum tmp = sum;
> +#endif
> 
>       __asm__(
>       "    .set    push        # csum_tcpudp_nofold\n"
> 
> ffffffff80aa835c:       00004025        move    a4,zero
> ffffffff80aa8360:       92020012        lbu     v0,18(s0)
> ffffffff80aa8364:       de140030        ld      s4,48(s0)
> ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
> ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
> ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
> ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
> ffffffff80aa8378:       afa70038        sw      a3,56(sp)
> ffffffff80aa837c:       24071a00        li      a3,6656
> ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
> ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
> ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
> ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
> ffffffff80aa8390:       0081202d        daddu   a0,a0,at
> ffffffff80aa8394:       0081082b        sltu    at,a0,at
> ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
> ffffffff80aa839c:       00812021        addu    a0,a0,at
> 
> The code produced by GCC remains the same between (1), (3) and (4),
> the last changes looks like better (with less changes based on commit
> 198688edbf77), so I will send v3 later.

Aren't those all the same - apart from register selection.
Not that I grok the mips opcodes.
But that code has horridness on its side.

The only obvious difference is that something else changes the
code offset from xxxx5c to xxxx6c.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)




[Index of Archives]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux