I can compile simple OpenACC programs with gcc on POWER8 + Nvidia K40.
But when launching the executable, I get this error:
libgomp: Cannot map target functions or variables (expected 0, have 1)
More specifically, this error is triggered every time one of the directives
#pragma acc parallel
#pragma acc kernels
is reached at runtime.
For example, this code won't work:
#pragma acc parallel loop
for(int i=0;i<10;i++) a[i]=0;
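To make the failing case easy to try in isolation, here is a minimal sketch of the same loop wrapped in a function. The name zero_array and the explicit copyout clause are my additions for illustration (they are not in test.cpp). Built with -fopenacc, the loop is offloaded; built without it, the pragma is ignored and the loop runs on the host.

```c
/* Hypothetical helper, for illustration only: zero the first n ints of a.
   With -fopenacc the loop is offloaded; without it the pragma is ignored
   and the loop runs on the host. */
void zero_array(int *a, int n)
{
    /* copyout: allocate a[0:n] on the device and copy it back to the host
       when the parallel region ends. */
#pragma acc parallel loop copyout(a[0:n])
    for (int i = 0; i < n; i++)
        a[i] = 0;
}
```

Calling zero_array(a, 10) from main on an array filled with non-zero values and then printing a[0] should be enough to hit the parallel directive at runtime.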
However, I successfully used other OpenACC functions and directives: acc_get_num_devices,
acc_malloc, #pragma acc enter data, #pragma acc update. They have the correct
behaviour (update for example will correctly update data on GPU).
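For reference, the kind of runtime call that does work for me can be sketched as below. The function name visible_nvidia_devices is hypothetical, and the _OPENACC guard is my assumption so that the file still builds (and reports 0) when compiled without -fopenacc.

```c
#ifdef _OPENACC
#include <openacc.h>   /* acc_get_num_devices, acc_device_nvidia */
#endif

/* Hypothetical helper, for illustration only: number of NVIDIA devices the
   OpenACC runtime can see, or 0 if the file was built without -fopenacc. */
int visible_nvidia_devices(void)
{
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_nvidia);
#else
    return 0;
#endif
}
```

A non-zero return value here only shows that the runtime sees the GPU; it does not exercise the offloaded code path that the parallel and kernels directives need.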
Strangely, #pragma acc enter data crashes when there is a parallel directive in
the same code, but works otherwise. (See test.cpp for details)
My compile command line is:
gcc -fopenacc -foffload=nvptx-none main.c /lib/gcc/powerpc64le-unknown-linux-gnu/6.0.0/crtoffloadbegin.o /lib/gcc/powerpc64le-unknown-linux-gnu/6.0.0/crtoffloadend.o
The libgomp.so.1 used by the executable is the right one: ldd shows that it is
the one in [gcc 6 install dir]/lib64.
I installed gcc using the April 23rd release candidate.
nvptx with configure --target=nvptx-none
configure --target=nvptx-none --enable-as-accelerator-for="powerpc64le-unknown-linux-gnu"
configure --enable-offload-targets=nvptx-none=$GCC6ROOT/install\
Any clue what could be wrong?
Thanks!
Louis
// CODE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main() {
    int res[1000];
    for (int i = 0; i < 1000; i++) res[i] = 100;
    printf("Copy begin ...\n");
    #pragma acc enter data copyin(res)
    printf("Copy OK\n");
    #pragma acc parallel
    for (int i = 0; i < 1000; i++) res[i] = 200;
    #pragma acc update self(res)
    printf("result=%d\n", res[10]); // if res==200, that means GPU has not been correctly updated
}
// OUTPUT
/*
- If the #pragma acc parallel line is commented out, the code successfully runs and prints "result=200"
(meaning data was successfully transferred on GPU)
- If the parallel line is left in, the result is:
Copy begin ...
libgomp: Cannot map target functions or variables (expected 0, have 1)
*/