Dear all,

I have built GCC version 6.2.1 20161117 and am running a very simple program that requests 32 gangs. I compile it with

    gcc -fopenacc -o3 -o piexample.x piexample.c

with the following environment:

    export GOMP_DEBUG=1
    export ACC_DEVICE_TYPE=nvidia

The problem is that whatever number of gangs I request, the result is always the same: only 1 gang is launched.

    GOACC_parallel_keyed: mapnum=1, hostaddrs=0x7fffc872dff0, size=0x6012c0, kinds=0x6012b8
    nvptx_exec: prepare mappings
    nvptx_exec: kernel main$_omp_fn$0: launch gangs=1, workers=1, vectors=32
    nvptx_exec: kernel main$_omp_fn$0: finished
    GOACC_data_end: restore mappings
    GOACC_data_end: mappings restored

If I compile it with the PGI compiler, the number of gangs is set correctly:

    main:
         12, Generating copy(pi)
         14, Loop is parallelizable
             Accelerator kernel generated
             Generating Tesla code
             14, #pragma acc loop gang(32), worker(8), vector(8) /* blockIdx.x threadIdx.y threadIdx.x */
         16, Generating implicit reduction(+:pi)

Does anyone have any ideas about the problem?

Thanks in advance

--
Sincerely
Esteban Hernandez B.
HPC specialist