Re: simd, redundant pcmpeqb and pxor


 



On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help
<gcc-help@xxxxxxxxxxx> wrote:
>
>
> Hello,
>
> look at this example(https://godbolt.org/z/TnGMsfMs6):
> ```
> auto foo(const char *p) {
>      const auto substr = _mm_loadu_si128((const __m128i *)p);
>      return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
> }
> ```
> and to the generated asm:
> ```
> 1: foo(char const*):
> 2:    movdqu  xmm0, XMMWORD PTR [rdi]
> 3:    pxor    xmm1, xmm1
> 4:    pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
> 5:    pcmpeqb xmm0, xmm1
> 6:    ret
> ```
> look at line 5.
> is there any reason for `pcmpeqb` instruction?
Looks like a missed optimization from

_4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
_3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
_5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3);  --- this?
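
For reference, a minimal sketch of why the extra pxor/pcmpeqb pair is redundant. It assumes that .LC0 holds sixteen copies of 47 (which is what the `<= 47` above suggests, but the constant is not shown in the asm), and the helper names are hypothetical, made up for this illustration. The first function is the comparison as written in the source; the second mirrors the sequence GCC emits, i.e. `(v > 47) == 0`. Both should give the same mask for every signed byte:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper names, for illustration only.

// The comparison as written in the source: v < '0' per signed byte.
// This needs only a single pcmpgtb with the operands swapped.
static inline __m128i lt_direct(__m128i v) {
    return _mm_cmplt_epi8(v, _mm_set1_epi8('0'));
}

// The shape of the GCC output: (v > 47) == 0, i.e. v <= 47.
// Assumption: .LC0 contains sixteen bytes of value 47.
static inline __m128i lt_via_not_gt(__m128i v) {
    const __m128i gt = _mm_cmpgt_epi8(v, _mm_set1_epi8(47));  // pcmpgtb
    return _mm_cmpeq_epi8(gt, _mm_setzero_si128());           // pxor + pcmpeqb
}

// Returns true if both forms produce identical masks for all 256 byte values.
static bool forms_agree() {
    for (int base = -128; base < 128; base += 16) {
        alignas(16) std::int8_t in[16];
        for (int i = 0; i < 16; ++i)
            in[i] = static_cast<std::int8_t>(base + i);
        const __m128i v = _mm_load_si128(reinterpret_cast<const __m128i *>(in));
        // XOR is zero in every byte exactly when the two masks match.
        const __m128i diff = _mm_xor_si128(lt_direct(v), lt_via_not_gt(v));
        const __m128i same = _mm_cmpeq_epi8(diff, _mm_setzero_si128());
        if (_mm_movemask_epi8(same) != 0xFFFF)
            return false;
    }
    return true;
}
```

Since the two forms are equivalent, the single swapped-operand pcmpgtb (as in the clang output) would be enough.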

Could you open a Bugzilla report for it?
https://gcc.gnu.org/bugzilla/

>
> just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
> ```
> 1: foo(char const*):
> 2:    movdqu  xmm1, xmmword ptr [rdi]
> 3:    movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
> 4:    pcmpgtb xmm0, xmm1
> 5:    ret
> ```
>
>
> best!



-- 
BR,
Hongtao


