Re: Loop Vectorization and OpenMP

On 14/01/13 16:04, Tim Prince wrote:
> It's a Frequently Encountered Problem.  What did 
> -ftree-vectorizer-verbose=3 produce?

Nothing.  At verbosity level 5 it gave:

  27: versioning for alias required: can't determine dependence between
  *D.1967_20 and *D.1988_49
  27: mark for run-time aliasing test between *D.1967_20 and *D.1988_49
  [...]
  27: disable versioning for alias - max number of generated checks
  exceeded.

which implies that "restrict" is being clobbered.

> Part of the problem is that the OpenMP chunks won't have the 
> alignments you set carefully for the start of the array, unless the 
> loop count happens to be a multiple of number of threads times 
> unrolling factor times vector register width, thus unknown at compile
> time. It remains to be seen how much OpenMP 4.0 proposals for pragmas
> to deal with this may help. Until then, OpenMP tends to work better
> with at least 2 levels of loops, where the outer is parallelizable
> and the inner vectorizable.

Okay.  Can anyone suggest a good blocking methodology which, given

  for (int i = 0; i < n; ++i)
    // Code which uses parameters ...

where the parameters ... must be aligned to X 'items' (so for 256-bit
AVX registers and float types X = 32/4 = 8), yields:

  for (outer)
    for (inner)
      // Code

such that the outer loop can be parallelized with OpenMP and the inner
loop auto-vectorized?
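
For concreteness, one shape such a blocking might take is sketched
below.  This is an illustration under assumptions, not a verified
answer: BLOCK, axpy_block and axpy_blocked are made-up names, a simple
y += a*x kernel stands in for the real loop body, and the arrays are
assumed to start on a 32-byte boundary so that, with BLOCK a multiple
of X = 8, every block start is aligned as well.  Whether the inner
loop actually gets vectorized would still need to be checked with
-ftree-vectorizer-verbose.

```c
#include <stddef.h>

/* Hypothetical block size in floats; chosen as a multiple of X = 8 so
   that block starts keep the 32-byte alignment of the array base. */
#define BLOCK 1024

/* Inner kernel for one block.  The pointers are restrict-qualified as
   function parameters, outside the OpenMP region, in the hope that the
   vectorizer can see they do not alias.  With GCC, the bases could in
   addition be passed through __builtin_assume_aligned(p, 32) if they
   are known to be 32-byte aligned (not done here). */
static void axpy_block(float *restrict y, const float *restrict x,
                       float a, int len)
{
    for (int j = 0; j < len; ++j)   /* candidate for auto-vectorization */
        y[j] += a * x[j];
}

/* Outer loop over blocks: this is the loop handed to OpenMP.  Without
   -fopenmp the pragma is ignored and the code runs serially. */
void axpy_blocked(float *y, const float *x, float a, int n)
{
    int nblocks = (n + BLOCK - 1) / BLOCK;   /* round up */

    #pragma omp parallel for schedule(static)
    for (int b = 0; b < nblocks; ++b) {
        int start = b * BLOCK;
        /* The last block may be shorter than BLOCK. */
        int len = (n - start < BLOCK) ? (n - start) : BLOCK;
        axpy_block(y + start, x + start, a, len);
    }
}
```

The idea is that each thread receives whole blocks, so chunk boundaries
fall on multiples of BLOCK rather than wherever n divided by the
thread count happens to land; only the final block has a length that
is not a multiple of X.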

Regards, Freddie.


