On Thu, 19 Oct 2023, Mathieu Malaterre via Gcc-help wrote:
After reading this SO post (*), I became curious as to what my gcc would do with the following code. It turns out that I cannot make any sense of the output:

% gcc -fopt-info-missed -std=c11 -O3 -c generic.c
generic.c:4:21: missed: couldn't vectorize loop
generic.c:2:5: missed: not vectorized: relevant phi not supported: found_14 = PHI <found_5(7), 0(15)>
If you try again with a snapshot of gcc-14, it does vectorize, although the result doesn't seem as nice as what clang produces.
(in general I would also suggest adding -march=native or some recent arch)
Doing a quick search did not reveal anything meaningful. If my understanding is correct, this is a basic information level (not meant for GCC developers):

* https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html#index-fopt-info

So my question is: what does the following message indicate?

... missed: not vectorized: relevant phi not supported: found_14 = PHI <found_5(7), -1(15)> ...
This deep in the optimization pipeline, some information is hard to translate into user-understandable language that refers to the source code, although in this case the message is not very informative even for a gcc dev. -fdump-tree-vect-details generates a dump file with more information, but that can be hard to understand if you are not used to it.
In this loop, some variables are 16 bits (haystack, needle), while the reduction variable is 32 bits, and gcc has a hard time vectorizing mixed sizes (and it doesn't realize that 'found' could be narrowed). If you declare 'found' as 'short' instead, something different happens.
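As a rough sketch of that experiment, assuming the hasmatch function from the original post: hasmatch_narrow is a hypothetical name used here only for comparison, and the only change is the type of 'found'. Whether and how the vectorizer handles each version depends on the gcc version and target, so compare the -fopt-info-missed output for both yourself.

```c
#include <stdint.h>

/* Shape from the original post: 16-bit elements, 32-bit 'found'.
 * The loop mixes element sizes (uint16_t compares, int reduction). */
int hasmatch(uint16_t needle, const uint16_t haystack[4])
{
  int found = 0;
  for (int i = 0; i < 4; ++i) {
    if (needle == haystack[i]) {
      found = 1;
    }
  }
  return found;
}

/* Hypothetical variant (name is an assumption, not from the post):
 * 'found' is declared 'short', so every value in the loop is 16 bits.
 * The result is widened to int only at the return. */
int hasmatch_narrow(uint16_t needle, const uint16_t haystack[4])
{
  short found = 0;
  for (int i = 0; i < 4; ++i) {
    if (needle == haystack[i]) {
      found = 1;
    }
  }
  return found;
}
```

Both functions return the same values; only the width of the reduction variable differs. Compiling with the flags from the original post, optionally plus -march=native, shows what the vectorizer reports for each.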
% cat generic.c
#include <stdint.h>
int hasmatch(uint16_t needle, const uint16_t haystack[4]) {
  int found = 0;
  for (int i = 0; i < 4; ++i) {
    if (needle == haystack[i]) {
      found = 1;
    }
  }
  return found;
}

(*) https://stackoverflow.com/questions/74803190/fastest-way-to-find-16bit-match-in-a-4-element-short-array

Thanks !
-- Marc Glisse