OpenACC parallel directives fail to run, libgomp error

I managed to compile simple OpenACC programs with gcc on POWER8 + Nvidia K40.
But when I launch the executable, I get this error:

    libgomp: Cannot map target functions or variables (expected 0, have 1)

More specifically, this error is triggered every time one of the directives
    #pragma acc parallel
    #pragma acc kernels
is reached at runtime.

For example, this code fails:

    #pragma acc parallel loop
    for(int i=0;i<10;i++) a[i]=0;
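
To make the fragment above easier to reproduce in isolation, it can be wrapped in a standalone function. This is only a sketch: the helper name zero_on_device and the explicit copyout clause are assumptions added for illustration and are not part of the original program. When the file is built without -fopenacc the pragma is ignored and the loop simply runs on the host.

```c
/* Hypothetical helper (not from the original program): zero the first n
   elements of a inside an OpenACC parallel region, then return how many
   elements are still non-zero on the host. The copyout(a[0:n]) clause is
   an assumption added so that the data movement is explicit. */
int zero_on_device(int *a, int n)
{
    #pragma acc parallel loop copyout(a[0:n])
    for (int i = 0; i < n; i++)
        a[i] = 0;

    /* Host-side check of the array after the region has finished. */
    int bad = 0;
    for (int i = 0; i < n; i++)
        if (a[i] != 0)
            bad++;
    return bad;
}
```

A return value of 0 means every element was zeroed. On the setup described here, the libgomp error would be expected to appear when the parallel region is entered, before the check runs.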

However, I successfully used other OpenACC functions and directives:
acc_get_num_devices, acc_malloc, #pragma acc enter data, #pragma acc update.
They behave correctly (update, for example, correctly updates the data on the GPU).
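
The device query mentioned above can be checked with a few lines. This is a minimal sketch: the helper names nvidia_device_count and report_devices are hypothetical, and the _OPENACC guard is an assumption added so that the file still compiles when -fopenacc is not passed.

```c
#include <stdio.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: number of NVIDIA devices reported by the OpenACC
   runtime, or -1 when the file was compiled without OpenACC support. */
int nvidia_device_count(void)
{
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_nvidia);
#else
    return -1;
#endif
}

/* Hypothetical helper: print what the runtime reports. */
void report_devices(void)
{
    int n = nvidia_device_count();
    if (n < 0)
        printf("built without OpenACC support\n");
    else
        printf("OpenACC runtime reports %d NVIDIA device(s)\n", n);
}
```

Calling report_devices() at the start of main shows whether the runtime sees the GPU at all, which helps separate device discovery from the later failure in the parallel region.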

Strangely, #pragma acc enter data crashes when there is a parallel directive in
the same code, but works otherwise. (See test.cpp for details)

My command line to compile is
gcc -fopenacc -foffload=nvptx-none main.c /lib/gcc/powerpc64le-unknown-linux-gnu/6.0.0/crtoffloadbegin.o /lib/gcc/powerpc64le-unknown-linux-gnu/6.0.0/crtoffloadend.o

The libgomp.so.1 used by the executable is the right one. I can see it with ldd.
(it is in [gcc 6 install dir]/lib64)

I installed gcc using the April 23rd release candidate.
nvptx with configure --target=nvptx-none
configure --target=nvptx-none --enable-as-accelerator-for="powerpc64le-unknown-linux-gnu"
configure --enable-offload-targets=nvptx-none=$GCC6ROOT/install\

Any clue what could be wrong?
Thanks!

Louis
// CODE

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {

    int res[1000];
    for (int i = 0; i < 1000; i++) res[i] = 100;

    printf("Copy begin ...\n");
    #pragma acc enter data copyin(res)
    printf("Copy OK\n");

    #pragma acc parallel
    for (int i = 0; i < 1000; i++) res[i] = 200;

    #pragma acc update self(res)

    printf("result=%d\n", res[10]); // if res==200, that means GPU has not been correctly updated

}

// OUTPUT
/*

- If the #pragma acc parallel line is commented out, the code runs successfully and prints "result=200"
(meaning data was successfully transferred to the GPU)

- If the parallel directive is left in, the result is:

Copy begin ...

libgomp: Cannot map target functions or variables (expected 0, have 1)

	
	*/
