On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help
<gcc-help@xxxxxxxxxxx> wrote:
>
>
> Hello,
>
> look at this example(https://godbolt.org/z/TnGMsfMs6):
> ```
> auto foo(const char *p) {
>     const auto substr = _mm_loadu_si128((const __m128i *)p);
>     return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
> }
> ```
> and to the generated asm:
> ```
> 1: foo(char const*):
> 2:         movdqu  xmm0, XMMWORD PTR [rdi]
> 3:         pxor    xmm1, xmm1
> 4:         pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
> 5:         pcmpeqb xmm0, xmm1
> 6:         ret
> ```
> look at line 5.
> is there any reason for `pcmpeqb` instruction?

Looks like a missed optimization from

  _4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
  _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
  _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3);
--- this?

Could you open a bugzilla for it?
https://gcc.gnu.org/bugzilla/

>
> just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
> ```
> 1: foo(char const*):
> 2:         movdqu  xmm1, xmmword ptr [rip + .LCPI0_0]
> 3:         movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
> 4:         pcmpgtb xmm0, xmm1
> 5:         ret
> ```
>
>
> best!

-- 
BR,
Hongtao
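
For reference, here is a small scalar sketch of why the extra `pcmpeqb` is redundant. It models a single byte lane of each instruction sequence and checks that they agree for every signed byte value. The helper names (`pcmpgtb_lane`, `pcmpeqb_lane`, `gcc_sequence`, `clang_sequence`, `sequences_agree`) are made up for illustration. It also assumes that `.LC0` holds sixteen 47s (inferred from the GIMPLE constant above) and that clang's `.LCPI0_0` holds sixteen '0' bytes; neither constant pool is shown in the listings.

```cpp
#include <cstdint>

// Per-lane model of PCMPGTB: all-ones if a > b (signed compare), else zero.
constexpr std::uint8_t pcmpgtb_lane(std::int8_t a, std::int8_t b) {
    return static_cast<std::uint8_t>(a > b ? 0xFF : 0x00);
}

// Per-lane model of PCMPEQB: all-ones if a == b, else zero.
constexpr std::uint8_t pcmpeqb_lane(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a == b ? 0xFF : 0x00);
}

// GCC's sequence: mask = (x > 47), then result = (mask == 0),
// i.e. x <= 47 computed as NOT(x > 47). Two compares.
// Assumption: .LC0 is a vector of 47s.
constexpr std::uint8_t gcc_sequence(std::int8_t x) {
    return pcmpeqb_lane(pcmpgtb_lane(x, 47), 0x00);
}

// Clang's sequence: result = ('0' > x), with the operands swapped
// so a single PCMPGTB is enough.
// Assumption: .LCPI0_0 is a vector of '0' (48).
constexpr std::uint8_t clang_sequence(std::int8_t x) {
    return pcmpgtb_lane('0', x);
}

// Exhaustively compare both sequences over all 256 signed byte values.
inline bool sequences_agree() {
    for (int v = -128; v <= 127; ++v) {
        const auto x = static_cast<std::int8_t>(v);
        if (gcc_sequence(x) != clang_sequence(x)) {
            return false;
        }
    }
    return true;
}
```

Since x < 48 is the same predicate as 48 > x for signed bytes, swapping the operands of `pcmpgtb` gives the result directly, whereas the x <= 47 form needs the negation that `pxor` + `pcmpeqb` implements.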