Re: GCC OpenAcc executables problems

Esteban Hernández <eshernan@xxxxxxxxx> · Fri, 15 Jan 2016 20:03:03 +0000

Alexander, I discovered the problem.

In my code I use

#pragma acc kernels loop gang(100) vector (256)

but when I set GOMP_DEBUG=1, the output of my execution is

 nvptx_exec: prepare mappings
  nvptx_exec: kernel main$_omp_fn$0: launch gangs=1, workers=1, vectors=32

The number of gangs is different from what I requested.

How do I pass the correct number of gangs to nvptx_exec?
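
For reference, here is a minimal sketch of the variant suggested earlier in the thread: a parallel construct with explicit num_gangs and vector_length clauses instead of gang()/vector() on a kernels loop. The function name estimate_pi and its parameter n are illustrative and not taken from my original program, and I have not confirmed that GCC honors these clauses on the nvptx target.

```c
/* Midpoint-rule estimate of pi: the integral of 4/(1+t^2) over [0,1].
   estimate_pi and n are hypothetical names used only for this sketch. */
double estimate_pi(long n)
{
    double sum = 0.0;
    long i;

    /* 100 and 256 mirror the gang/vector sizes requested above.
       Assumption: the launch dimensions are requested on the parallel
       construct; a compiler without OpenACC support ignores the pragma
       and runs the loop serially with the same result. */
    #pragma acc parallel loop num_gangs(100) vector_length(256) reduction(+:sum)
    for (i = 0; i < n; i++) {
        double t = (i + 0.5) / n;   /* midpoint of the i-th subinterval */
        sum += 4.0 / (1.0 + t * t);
    }

    return sum / n;
}
```

If this is built with -fopenacc and run with GOMP_DEBUG=1, the "launch gangs=..." line should show whether the requested values actually reach nvptx_exec.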

On Fri, Jan 15, 2016 at 7:22 PM, Esteban Hernández <eshernan@xxxxxxxxx> wrote:
> Poor performance,
>
> Dear Alexander,
>
> My pi example runs now, but it is very slow.
>
> For example, if I run the executable compiled with gcc, the time is
>
> real    0m7.626s
> user    0m5.574s
> sys     0m2.007s
>
>
> But if I compile the same code with pgi and time it, the result is
>
> pi=3.1415926536
>
> real    0m0.269s
> user    0m0.008s
> sys     0m0.208s
>
>
> Can you help me understand which FLAGS parameters I need to use to
> investigate the performance problem?
>
>
> Thanks again
>
> On Thu, Jan 14, 2016 at 4:51 PM, Alexander Monakov <amonakov@xxxxxxxxx> wrote:
>> On Thu, 14 Jan 2016, Esteban Hernández wrote:
>>
>>> On Thu, Jan 14, 2016 at 4:35 PM, Esteban Hernández <eshernan@xxxxxxxxx> wrote:
>>> > Dear Alexander,
>>> >
>>> > I reviewed the code of the pi implementation, and the pi value is copyout
>>> >
>>> >
>>> >         #pragma acc data copyout (pi)
>>> >         #pragma acc parallel vector_length (vl)  reduction (+:pi)
>>> >         for (i=0; i<N; i++) {
>>> >             double t= (double)((i+0.5)/N);
>>> >             pi +=4.0/(1.0+t*t);
>>> >          }
>>> >         printf("pi=%11.10f\n",pi/N);
>>> >
>>> > But when I run the program with strace, it waits forever,
>>
>> It probably just takes a lot of time (it's running with only 1 worker and gang),
>> try decreasing N or add num_workers/num_gangs clauses in addition to
>> vector_length.
>>
>> (I'm not sure if OpenACC requires running with 1 worker and gang in your
>> example, or GCC is behaving suboptimally -- please wait for comment from
>> OpenACC implementors in GCC)
>>
>> Alexander
>
>
>
> --
> Sincerely
>
>
> Esteban Hernandez B.
> HPC specialist

-- 
Sincerely

Esteban Hernandez B.
HPC specialist