Re: simd, redundant pcmpeqb and pxor

On Mon, Nov 7, 2022 at 2:26 PM <i.nixman@xxxxxxxxxxxxx> wrote:
>
> On 2022-11-07 03:32, Hongtao Liu wrote:
> > On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help
> > <gcc-help@xxxxxxxxxxx> wrote:
> >>
> >>
> >> Hello,
> >>
> >> Look at this example (https://godbolt.org/z/TnGMsfMs6):
> >> ```
> >> auto foo(const char *p) {
> >>      const auto substr = _mm_loadu_si128((const __m128i *)p);
> >>      return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
> >> }
> >> ```
> >> and at the generated asm:
> >> ```
> >> 1: foo(char const*):
> >> 2:    movdqu  xmm0, XMMWORD PTR [rdi]
> >> 3:    pxor    xmm1, xmm1
> >> 4:    pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
> >> 5:    pcmpeqb xmm0, xmm1
> >> 6:    ret
> >> ```
> >> Look at line 5.
> >> Is there any reason for the `pcmpeqb` instruction?
>
> hi,
>
> > Looks like a missed optimization from
> >
> > _4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
> > _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
> > _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3);  --- this?
> >
> > Could you open a Bugzilla report for it?
> > https://gcc.gnu.org/bugzilla/
>
> sure, but for which component?
Let's file it under rtl-optimization first.
>
>
>
> >>
> >> Just for info, clang's output (https://godbolt.org/z/MPnvEMdhr):
> >> ```
> >> 1: foo(char const*):
> >> 2:    movdqu  xmm1, xmmword ptr [rdi]
> >> 3:    movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
> >> 4:    pcmpgtb xmm0, xmm1
> >> 5:    ret
> >> ```
> >>
> >>
> >> best!
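
For reference, the two code sequences quoted above should compute the same mask. The following is a minimal sketch, assuming an x86-64 target with SSE2; the helper names (lt_direct, lt_swapped, lt_via_not_gt, all_forms_agree) are hypothetical and only illustrate the equivalence. It compares the original `x < '0'` with the swapped-operand form `'0' > x` (a single pcmpgtb, as in the clang output) and with the form matching the GIMPLE `_4 <= 47`, emitted as `(x > 47) == 0` (pcmpgtb followed by pcmpeqb against zero).
```
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Original form from the example: per signed byte, x < '0'.
__m128i lt_direct(__m128i x) {
    return _mm_cmplt_epi8(x, _mm_set1_epi8('0'));
}

// Swapped-operand form: '0' > x, a single signed greater-than compare.
__m128i lt_swapped(__m128i x) {
    return _mm_cmpgt_epi8(_mm_set1_epi8('0'), x);
}

// Two-step form: compute x > 47, then invert the mask by comparing
// it for equality with zero, i.e. !(x > 47), which is x <= 47.
__m128i lt_via_not_gt(__m128i x) {
    const __m128i gt = _mm_cmpgt_epi8(x, _mm_set1_epi8(47));
    return _mm_cmpeq_epi8(gt, _mm_setzero_si128());
}

// True when all 16 byte lanes of a and b are equal.
bool same(__m128i a, __m128i b) {
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}

// Checks the three forms against each other for every possible byte value.
bool all_forms_agree() {
    for (int base = 0; base < 256; base += 16) {
        std::uint8_t bytes[16];
        for (int i = 0; i < 16; ++i) {
            bytes[i] = static_cast<std::uint8_t>(base + i);
        }
        const __m128i x =
            _mm_loadu_si128(reinterpret_cast<const __m128i *>(bytes));
        const __m128i ref = lt_direct(x);
        if (!same(ref, lt_swapped(x)) || !same(ref, lt_via_not_gt(x))) {
            return false;
        }
    }
    return true;
}
```
Since the forms agree on every input, the extra pxor/pcmpeqb pair only inverts a mask that one compare with swapped operands could produce directly.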



-- 
BR,
Hongtao


